Lecture Notes in 
Computer Science 1758 



Howard Heys Carlisle Adams (Eds.) 



Selected Areas 
in Cryptography 

6th Annual International Workshop, SAC’99 
Kingston, Ontario, Canada, August 1999 
Proceedings 







Springer 




Lecture Notes in Computer Science 1758 

Edited by G. Goos, J. Hartmanis and J. van Leeuwen 




Springer 

Berlin 

Heidelberg 

New York 

Barcelona 

Hong Kong 

London 

Milan 

Paris 

Singapore 

Tokyo 




Howard Heys Carlisle Adams (Eds.) 



Selected Areas 
in Cryptography 



6th Annual International Workshop, SAC’99
Kingston, Ontario, Canada, August 9-10, 1999 
Proceedings 




Springer 




Series Editors 



Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 



Volume Editors 
Howard Heys

Faculty of Engineering and Applied Science 

Memorial University of Newfoundland 

St. John’s, Newfoundland, Canada A1B 3X5

E-mail: howard@engr.mun.ca 

Carlisle Adams 

Entrust Technologies 

750 Heron Road, Suite EOS 

Ottawa, Ontario, Canada K1V 1A7

E-mail: cadams@entrust.com 



Cataloging-in-Publication Data applied for 

Die Deutsche Bibliothek - CIP-Einheitsaufnahme 

Selected areas in cryptography : 6th annual international workshop ; 
proceedings / SAC ’99, Kingston, Ontario, Canada, August 9-11, 
1999. Howard Heys ; Carlisle Adams (ed.). - Berlin ; Heidelberg ; New 
York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; 
Tokyo : Springer, 2000 

(Lecture notes in computer science ; Vol. 1758) 

ISBN 3-540-67185-4 



CR Subject Classification (1991): E.3, C.2, D.4.6, K.6.5, F.2.1-2, H.4.3 
ISSN 0302-9743 

ISBN 3-540-67185-4 Springer-Verlag Berlin Heidelberg New York



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
liable for prosecution under the German Copyright Law. 

Springer-Verlag is a company in the specialist publishing group BertelsmannSpringer
© Springer-Verlag Berlin Heidelberg 2000 
Printed in Germany 

Typesetting: Camera-ready by author 

Printed on acid-free paper SPIN 10719627 57/3144 5 43 2 1 0 




Preface 



SAC’99 was the sixth in a series of annual workshops on Selected Areas in 
Cryptography. Previous workshops were held at Carleton University in Ottawa 
(1995 and 1997) and at Queen’s University in Kingston (1994, 1996, and 1998). 
The intent of the annual workshop is to provide a relaxed atmosphere in which 
researchers in cryptography can present and discuss new work on selected areas 
of current interest. The themes for the SAC’99 workshop were: 

— Design and Analysis of Symmetric Key Cryptosystems 

— Efficient Implementations of Cryptographic Systems 

— Cryptographic Solutions for Web/Internet Security 

The timing of the workshop was particularly fortuitous as the announcement 
by NIST of the five finalists for AES coincided with the first morning of the 
workshop, precipitating lively discussion on the merits of the selection! 

A total of 29 papers were submitted to SAC’99 and, after a review process 
that had all papers reviewed by at least 3 referees, 17 were accepted and pre- 
sented. As well, two invited presentations were given: one by Miles Smid from 
NIST entitled “From DES to AES: Twenty Years of Government Initiatives in 
Cryptography” and the other by Mike Reiter from Bell Labs entitled “Password 
Hardening with Applications to VPN Security”.

The program committee for SAC’99 consisted of the following members: 
Carlisle Adams, Tom Cusick, Howard Heys, Lars Knudsen, Henk Meijer, Luke 
O’Connor, Doug Stinson, Stafford Tavares, and Serge Vaudenay. As well, addi- 
tional reviewers were: Christian Cachin, Louis Granboulan, Helena Handschuh, 
Julio Lopez Hernandez, Mike Just, Alfred Menezes, Serge Mister, Guillaume 
Poupard, Victor Shoup, Michael Wiener, and Robert Zuccherato. 

The organizers are very grateful for the financial support for the workshop 
received from Entrust Technologies, the Department of Electrical and Computer 
Engineering at Queen’s University, and Communications and Information Tech- 
nology Ontario (CITO). Special thanks to Stafford and Henk must be given for, 
once again, hosting SAC and being responsible for all the local arrangement de- 
tails. The organizers would also like to thank Sheila Hutchison of the Department 
of Electrical and Computer Engineering at Queen’s University for administra- 
tive and secretarial help and Yaser El-Sayed from the Faculty of Engineering 
at Memorial University of Newfoundland for help in preparing the workshop 
proceedings. 

On behalf of the SAC’99 organizing committee, we thank all the workshop 
participants for making SAC’99 a success! 
Abstract. DES and triple-DES are two well-known and popular encryption
algorithms, but they both have the same drawback: their block size
is limited to 64 bits. While the cryptographic community is working hard 
to select and evaluate candidates and finalists for the AES (Advanced 
Encryption Standard) contest launched by NIST in 1997, it might be of 
interest to propose a secure and simple double block-length encryption 
algorithm. More than in terms of key length and block size, our Uni- 
versal Encryption Standard is a new construction that remains totally 
compliant with DES and triple-DES specifications as well as with AES 
requirements. 



1 Introduction 

For many years, DES has been used as a worldwide encryption standard. But
as technology improved for specialized key-search machines, its 56-bit key
size became too short, and a replacement was needed. 2-key triple-DES has since
become the traditional block cipher used both by the cryptographic community
and by industry. However, there is a second drawback to DES which is also
the case for triple-DES: its 64-bit block size. Therefore NIST launched a contest
to select and evaluate candidates for a new encryption standard, the AES, in late
1997. The basic requirements for this new algorithm were that it be at least
as secure and fast as triple-DES, but that its block size be of 128 bits instead of 
64, and that its key size take possible values of 128, 192 and 256 bits. 

Meanwhile, people are still using DES and triple-DES, and may want to start
developing applications where these two as well as the new AES may independently
be used as the encryption components. In order to be compliant with DES
and triple-DES, we propose a new construction which is based on these building
blocks, but which can take AES specifications as a requirement for its key and
block sizes. Therefore, when AES is finally selected, it will come as a natural
plug-in replacement of the current structure without anybody being forced to
change input and output interfaces.

We notice that double block-length encryption primitives based on DES already
exist: as an example, take DEAL, which uses DES as the round function in
a traditional 6-round Feistel scheme. One can also think of multiple modes

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 1-12, 2000.
Helena Handschuh and Serge Vaudenay

with two blocks, where DES is the underlying cipher. However, except for two-key
triple DES in outer CBC mode, which is vulnerable to dictionary and matching
ciphertext attacks, none of these constructions is backward compliant with
DES and triple-DES, nor do they make use of the full strength of a 128-bit
block size (the second half of the plaintext never influences the first half of the
ciphertext). Furthermore, multiple modes are either insecure or require
confidentiality or integrity protected initial values. We are also aware of
the attacks by Lucks on 3-key triple DES and DEAL.

The rest of the paper is organized as follows: section 2 presents our new en- 
cryption standard. Sections 3 and 4 provide details on collision attacks when 
some of the components of our UES are cut out. Section 5 provides additional 
security arguments on our construction and evaluates its strength based on the 
EX construction. Finally, we argue why we believe our construction is sound. 

2 A Universal Encryption Standard 

In this section we give the specifications of our new double block-length
encryption algorithm. It basically runs two triple-DES encryptions in parallel and
exchanges some of the bits of both halves in between each of the three encryption
layers. Note that Outerbridge proposed a similar idea. We investigated several
related constructions and decided to add pre- and post-whitening with extra
keys, as well as an additional layer where bits of the left and the right half of
the scheme are swapped under control of the extended secret key. Justification
for these final choices will be given throughout this paper. The key schedule is
considered to be the same as DEAL’s.

2.1 Notations 

We use the following notations for our scheme as well as for the attacks presented 
in the next sections (all operations are on bitstrings): 

$a|b$ : concatenation of $a$ and $b$
$a \oplus b$ : bitwise “exclusive or” of $a$ and $b$
$a \wedge b$ : bitwise “and” of $a$ and $b$
$\bar{a}$ : bitwise 1-complement of $a$
$\mathtt{001110100111}_b$ : bitstring in binary notation
$\mathtt{3a7}_x$ : bitstring in hexadecimal notation with implicit length (multiple of four)

In addition we let $\mathrm{DES}_k(x)$ denote the DES encryption of a 64-bit block $x$
by using a 56-bit key $k$, and we let $\mathrm{3DES}_{k_1,k_2}(x)$ denote the 2-key triple-DES
encryption of $x$ in EDE mode (Encryption followed by Decryption followed by
Encryption), i.e.

$$\mathrm{3DES}_{k_1,k_2}(x) = \mathrm{DES}_{k_1}\left(\mathrm{DES}^{-1}_{k_2}\left(\mathrm{DES}_{k_1}(x)\right)\right).$$
2.2 Basic Building Blocks

We already mentioned that we use parallel 3DES as well as a kind of keyed swap.
In order to further formalize our proposal, let us define the following three basic
building blocks which refer to operations on 128-bit strings. For convenience, we
split a 128-bit string $x$ into two 64-bit halves $x_h$ and $x_l$.

1. Keyed Translation. Let $k = k_h|k_l$ be a 128-bit string. We define

$$T_k(x) = x \oplus k.$$

2. Keyed Swap. Let $k$ be a 64-bit string. We define

$$S_k(x) = (x_h \oplus u)|(x_l \oplus u)$$

where $u = (x_h \oplus x_l) \wedge k$. This actually consists of exchanging the bits which
are masked by $k$ in the two halves.

3. Parallel Encryption. Let $k = k_h|k_l$ be two concatenated keys for two
keyed algorithms $C$ and $C'$. We define

$$P_{k,C,C'}(x) = C_{k_h}(x_h)|C'_{k_l}(x_l).$$

Our algorithm is a combination of three rounds of products of these transfor- 
mations with additional operations before the first and after the last encryption 
layer. 
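As an illustration, the three building blocks can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function names are ours, 64-bit halves are modelled as Python integers, and `toy_cipher` is a hypothetical, insecure keyed bijection that merely stands in for DES so that the example is self-contained.

```python
MASK64 = (1 << 64) - 1  # all operations act on 64-bit halves


def keyed_translation(x_h, x_l, k_h, k_l):
    """T_k: XOR each half of x with the matching half of the 128-bit key k."""
    return x_h ^ k_h, x_l ^ k_l


def keyed_swap(x_h, x_l, k):
    """S_k: exchange between the two halves the bit positions selected by k."""
    u = (x_h ^ x_l) & k
    return x_h ^ u, x_l ^ u


def parallel(x_h, x_l, c_h, c_l):
    """P: encrypt each half independently with its own keyed cipher."""
    return c_h(x_h), c_l(x_l)


def toy_cipher(key):
    """Hypothetical 64-bit keyed bijection standing in for DES (NOT secure)."""
    odd = 0x9E3779B97F4A7C15  # odd constant, so multiplication mod 2^64 is invertible
    return lambda x: ((x ^ key) * odd) & MASK64


m = 0x00000000FFFFFFFF  # a mask selecting the low 32 bits of each half
x_h, x_l = 0x1111111122222222, 0x3333333344444444
s_h, s_l = keyed_swap(x_h, x_l, m)
print(hex(s_h), hex(s_l))
```

Applying `keyed_swap` twice with the same mask returns the original halves, which is why the same swap can be used for both encryption and decryption.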

2.3 Our New DES and 3DES-Compliant Construction 

Having defined the above components, let $m = \mathtt{00000000ffffffff}_x$, and let
$k' = k_1|k_2|k_3|k_4$ and $m' = m_1|m_2|m_3|m_4$ be respectively two 256-bit extended
keys derived from $k$ by the key schedule.

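Definition 1.

$$\mathrm{UES}^*_{k'} = P_{k_1|k_3,\mathrm{DES},\mathrm{DES}} \circ S_m \circ P_{k_2|k_4,\mathrm{DES}^{-1},\mathrm{DES}^{-1}} \circ S_m \circ P_{k_1|k_3,\mathrm{DES},\mathrm{DES}}$$

See figure 1. Then the precise formula to encrypt a plaintext under key $k$
using UES reads as follows:

Definition 2.

$$\mathrm{UES}_k = S_{m_4} \circ T_{m_3|m_3} \circ \mathrm{UES}^*_{k'} \circ T_{m_2|m_2} \circ S_{m_1}$$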

See figure 2. This algorithm has two interesting properties. Namely if we set 
m' = 0 and k' = k, we have 

Property 1.

$$\mathrm{UES}_{k_1|k_2|k_1|k_2}(x_l|x_l) = \mathrm{UES}^*_{k_1|k_2|k_1|k_2}(x_l|x_l) = \mathrm{3DES}_{k_1,k_2}(x_l)|\mathrm{3DES}_{k_1,k_2}(x_l)$$



and 
Fig. 1. UES*: Double-block length parallel triple DES (inputs $x_h$, $x_l$; outputs $y_h$, $y_l$)



Property 2.

$$\mathrm{UES}_{k_1|k_1|k_1|k_1}(x_l|x_l) = \mathrm{UES}^*_{k_1|k_1|k_1|k_1}(x_l|x_l) = \mathrm{DES}_{k_1}(x_l)|\mathrm{DES}_{k_1}(x_l).$$

In addition it operates on 128-bit block messages. This makes the algorithm 
compatible with the forthcoming AES, and usable in DES or triple-DES mode. 
Finally, if we set m = 0, we can even run two full DES or 3DES encryptions in 
parallel, which doubles the encryption speed (two blocks are encrypted applying 
UES* only once). 

Note that this scheme makes it possible to construct double block-length encryption
algorithms no matter what the underlying cipher is. For simplicity throughout this
paper we will consider DES, but any other secure 64-bit block cipher could do
the job. We will also focus on generic attacks that do not exploit the internal
structure of the component encryption algorithm. Specific attacks such as
differential or linear cryptanalysis, or truncated or higher order differentials,
do not apply in this context as at least three layers of basic encryption are
applied. We also believe that the best way to attack the scheme by a generic
method is to try to create inner collisions.
Fig. 2. Encryption with UES.

2.4 The Key-Schedule

In Table 1 below, we summarize in which different modes UES may be used. 



Mode                  DES                  3DES                 AES

Key size              56                   112                  128/192/256

Block size 64 bits    k' = k|k|k|k         k' = k1|k2|k1|k2     -
                      m' = 0, m = 0        m' = 0, m = 0

Block size 128 bits   k' = k|k|k|k         k' = k1|k2|k1|k2     k' = k1|k2|k3|k4
                      m' = 0, x_h = x_l    m' = 0, x_h = x_l    m' = m1|m2|m3|m4

Table 1. Key-schedule for DES, 3DES and AES modes



The four subkeys and the four submasks used in AES-mode are derived from
the user key using DEAL’s key-schedule (for a 256-bit key). The user key is
first divided into $s$ subkeys of 64 bits each for $s = 2, 3, 4$. These $s$ keys
are then expanded to 8 keys by repetition, and the keys are XORed with a new
constant for every repetition. The expanded list of keys is encrypted using DES
in CBC mode with a fixed key $K = \mathtt{0123456789abcdef}_x$ and with the initial
value set to zero. In order to partially allow on-the-fly key generation, start by
deriving $m_1$ and $m_2$, next derive the four DES keys forming $k'$, and finally
derive $m_3$ and $m_4$.

We are aware of Kelsey and Schneier’s key-schedule cryptanalysis of
DEAL. It turns out UES may have a very small class of equivalent keys in the 
192-bit key case, because of the use of 56-bit keys for the inner DES blocks, 
whereas 64 bit subkeys are generated by the key-schedule. We also worked out 
a similar related-key attack with John Kelsey, which recovers the keys in com- 
plexity 2®"^ using 2®® related keys. However, these attacks apply in a very limited 
number of practical settings. Developers should still make sure an attacker is
not allowed to choose the keys in such a way.

3 Collision Attacks on Parallel DES 

In this section, we consider the variant of UES previously defined as:

$$\mathrm{UES}^*_{k'} = P_{k_1|k_3,\mathrm{DES},\mathrm{DES}} \circ S_m \circ P_{k_2|k_4,\mathrm{DES}^{-1},\mathrm{DES}^{-1}} \circ S_m \circ P_{k_1|k_3,\mathrm{DES},\mathrm{DES}}$$

We will show that this straightforward way of doubling the block size is not
secure because a collision attack can be mounted against it (this phenomenon
has been independently observed by Knudsen). This is due to the fact that
the construction is not a multipermutation. In other words, it may very well
happen that if half of the input bits have a fixed value, half of the output bits
also have, which would not be the case if the multipermutation property had
been satisfied. However, our intention is to prove that we can nevertheless
use the structure if the input and output bits to this variant are unknown to
the attacker. Therefore we begin by showing where the problem comes from,
and justify our additional layers of swapping and masking in the final version of
UES.
3.1 Public Intermediate Swapping

We first show how to break UES* by recovering the secret key with about 2^^ 
chosen plaintexts, 2®® DES operations, and a memory of 16GB. The attack 
consists in the following steps. 

Step 1. First fix $x_h = a$ to a constant and try many $x_l = u_i$ values for
$i = 1, \ldots, n$. Request $y_i|z_i = \mathrm{UES}^*(a|u_i)$.

Step 2. For any collision $y_i = y_j$, guess that this comes from collisions on
the two inputs of the second and third internal DES higher permutations
(these are called “good collisions”). The expected number of good collisions
is $n^2 \cdot 2^{-65}$ and the expected number of “natural” (bad) collisions is the same.

Try all possible $k_3$ until there is a collision on both

$$\mathrm{DES}_{k_3}(x_i) \wedge m = \mathrm{DES}_{k_3}(x_j) \wedge m$$

and

$$\mathrm{DES}^{-1}_{k_3}(z_i) \wedge m = \mathrm{DES}^{-1}_{k_3}(z_j) \wedge m.$$

Note that a single (good) collision will always suggest the good $k_3$ value and
an expected number of $2^{-8}$ random ones, and a bad collision will suggest
$2^{-8}$ random values on average. It is thus likely that we get the $k_3$ key once
a $k_3$ value is suggested twice (namely with a confidence of $2^{8} : 1$). We thus
need only two good collisions. This requires $n \approx 2^{33}$.

Step 3. Perform a similar attack on $k_1$.

Step 4. Recover $k_2$, then $k_4$ by exhaustive search.

This shows that this algorithm is just a little more secure than DES, and far
less secure than triple-DES. We add that Bart Preneel pointed out to us that
in Step 1, a collision on the other half of the ciphertext occurs with the same
probability, therefore we get an extra condition satisfied by $k_3$ as well as $k_1$ from
the same number of chosen plaintexts. This slightly decreases the number of
required chosen plaintexts.

As a matter of fact, the previous attack holds whenever $m$ is any other public
value. Namely let $w$ denote the Hamming weight of $m$. Without loss of generality,
let us assume that $w \le 32$ (otherwise, let us consider the lower $\mathrm{DES}^{-1}$
operation). In the attack above, the number of expected good collisions is $n^2 \cdot 2^{-2w-1}$,
and the number of bad collisions is $n^2 \cdot 2^{-65}$. The attack thus still holds but with
a complexity of $n \approx 2^{w+1}$. In general, the complexity is thus $n \approx 2^{\min(w,\,64-w)+1}$.

Actually, the complexity is the highest for UES*, because $m$ has a balanced
Hamming weight. (Note that this analysis does not hold if $m = 0$ or $m = \bar{0}$,
for which we have two triple-DES in parallel, and no possible collision.)
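The counting behind this attack can be checked with a short calculation. The sketch below is ours and rests on stated assumptions rather than on formulas taken verbatim from the text: a pair of chosen plaintexts is assumed to collide at each of the two swaps with probability $2^{-w}$, a natural collision on a 64-bit half is assumed to occur with probability $2^{-64}$, and each of the two masked conditions is assumed to hold for a wrong 56-bit key with probability $2^{-w}$. All function names are hypothetical.

```python
from math import log2

KEY_BITS = 56   # DES key size
HALF_BITS = 64  # size of one half of the 128-bit block


def log2_good_collisions(log2_n, w):
    """log2 of the expected good collisions among n^2/2 pairs: n^2 * 2^(-2w-1)."""
    return 2 * log2_n - 2 * w - 1


def log2_bad_collisions(log2_n):
    """log2 of the expected natural collisions on one half: n^2 * 2^(-65)."""
    return 2 * log2_n - (HALF_BITS + 1)


def log2_n_needed(w, good=2):
    """log2 of the number of chosen plaintexts giving `good` good collisions."""
    return (log2(good) + 2 * w + 1) / 2


def log2_false_keys(w):
    """log2 of the expected wrong keys passing both masked conditions."""
    return KEY_BITS - 2 * w


for w in (16, 24, 32):
    print(w, log2_n_needed(w), log2_false_keys(w))
```

Under these assumptions the balanced mask ($w = 32$) is the most expensive case for the attacker, and it yields as many bad collisions as good ones, in line with the "2 good collisions and 2 bad ones" quoted for the worst case.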

3.2 Keyed Intermediate Swapping 

At first sight, one might think that introducing a keyed inner swap significantly 
increases the complexity of the attack. However this is not the case. Let us show 
why. 
If $m$ is a part of the key, we cannot exhaustively search for $k_3$ because we
do not know $m$. In the worst case ($w = 32$) we have 2 good collisions and 2 bad
ones. Let us assume we have guesses which are the good ones (this may lead
to an overhead factor of 16). For each possible $k_3$ we can look at which bits
$\mathrm{DES}_{k_3}(x_i)$ and $\mathrm{DES}_{k_3}(x_j)$ collide, as well as $\mathrm{DES}^{-1}_{k_3}(z_i)$ and $\mathrm{DES}^{-1}_{k_3}(z_j)$. For the
good $k_3$ we will find $w + \frac{64-w}{4}$ bits on average with a standard deviation of
$\frac{1}{4}\sqrt{3(64-w)}$. For a random $k_3$ we will find $16 \pm 2\sqrt{3}$ bits (which means that 16
is the average and $2\sqrt{3}$ the standard deviation). In order to simplify the analysis,
let us consider the worst case where $w = 32$. So for the right key $k_3$ and for a
good collision, the number of colliding bits is $40 \pm \sqrt{6}$.

Now for each possible $k_3$ count the collisions that have say more than $t = 30$
bits as they will much more likely result from the right key value rather than from
a random key value. Then the key guess associated to the most such collisions
will be the right one with high probability.
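The averages and standard deviations quoted above follow from a simple binomial model, which the sketch below makes explicit. It is an illustration under our own assumptions: a bit position counts as colliding when both the forward and the backward condition agree on it, which happens with probability 1/4 for unrelated values, and the $w$ masked positions always collide for the right key on a good collision. The tail estimate uses a normal approximation; the function names are ours.

```python
from math import erfc, sqrt

BLOCK = 64  # bit positions examined per collision


def random_key_stats():
    """Mean and standard deviation of colliding positions for a wrong key."""
    p = 0.25
    return BLOCK * p, sqrt(BLOCK * p * (1 - p))


def right_key_stats(w):
    """Mean and standard deviation for the right key on a good collision."""
    free = BLOCK - w  # positions not forced to collide by the mask
    return w + free * 0.25, sqrt(free * 0.25 * 0.75)


def upper_tail(t, mean, sd):
    """Normal approximation of P[count > t]."""
    return 0.5 * erfc((t - mean) / (sd * sqrt(2)))


mean_r, sd_r = random_key_stats()
mean_k, sd_k = right_key_stats(32)
print(mean_r, sd_r, upper_tail(30, mean_r, sd_r))
print(mean_k, sd_k, upper_tail(30, mean_k, sd_k))
```

With a threshold of $t = 30$ the two distributions barely overlap, which is what lets the attacker separate the right key from random ones by counting.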

Let

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-u^2/2}\,du \qquad (1)$$



Then as a matter of fact, the average number Ct of collisions on more than 
t bits is 

Ct = n^2"®®(l -if)± n.2-^^-V</5(l - <p) (2) 



where (/? = (/?!=(/? 
value, and = (^2 = 




1 — 10 ^ ® for t = 30 in case of a random key 
2-15.8 correct key guess. 



So for n = 2^^ the average number of such collisions is 2“® ® for a wrong 
guess, whereas it is about 4 for the right key. 

As a result of this enhanced collision attack, we chose not to use any keyed
inner swapping as it unnecessarily complicates the design (also more key material
is needed) without significantly increasing its security. Instead of this, we chose
to add the features described hereafter.



4 Introducing Pre- and Post-whitening 

The scheme of the previous section is compliant with DES and 3DES, but is not 
secure enough against key recovery attacks. So the next most straightforward 
idea is to protect against the exhaustive key search (once a collision is found) 
by adding whitening keys before and after the current structure. This consid- 
erably increases the work factor and derives from a principle discussed in the 
construction of DESX.

Let us define this new variant of UES by:

Definition 3.

$$\mathrm{UES}^{**} = T_{m_3|m_3} \circ \mathrm{UES}^* \circ T_{m_2|m_2}$$
The complexity of exhaustive key search increases to about = 2 ^^® offline
encryptions given one single collision due to the DESX phenomenon (other
trade-offs can be achieved if more than one collision is available). However, this 
variant can still be distinguished from a random permutation by the previous
attack because collisions are far more likely in this setting (they occur twice as
often). Note that a collision on one half of the output of UES* leads to a collision
on the same half of the ciphertext because the value $m_3$ is kept constant.
Therefore with the same complexity as the collision attack, this second variant may
be distinguished from a random permutation, which is not a desirable feature.

5 On the Importance of Pre- and Post-swapping 

Having solved the key recovery problem, we still face the distinguisher problem.
Therefore the next and last step towards our final UES is to add yet another
layer of swapping, but this time under control of the secret key. We will first
show how the addition of $m_4$ increases the complexity of the collision attack,
and next which additional work factor is introduced by $m_1$.



5.1 Keyed Swap of the Output 

Adding a keyed post-swapping, our current structure becomes:

Definition 4.

$$\mathrm{UES}^{***} = S_{m_4} \circ \mathrm{UES}^{**} = S_{m_4} \circ T_{m_3|m_3} \circ \mathrm{UES}^* \circ T_{m_2|m_2}$$

In order to build the same attack as the distinguisher of the previous section
(key recovery is hopeless by now), we need to create collisions on one half of
the output. However, this time these collisions are much harder to spot, as we do
not know which output bits correspond say to the left half of the output of the
butterfly structure.

Nevertheless, we can still use a property of $S_{m_4}$. Namely, if $x_h = x'_h$, we let
$\Delta = S_{m_4}(x) \oplus S_{m_4}(x')$ and it holds that:

1. $\Delta_h \wedge \Delta_l = 0$

2. $(\Delta_h)_i \le (m_4)_i$

3. $(\Delta_l)_i \le (\overline{m_4})_i$

for any bit $i = 1, \ldots, 64$.

Proof. Let $y_h = (S_{m_4}(x))_h$, $y_l = (S_{m_4}(x))_l$, $y'_h = (S_{m_4}(x'))_h$ and $y'_l = (S_{m_4}(x'))_l$.
Then we have the following results:

1. When $x_h = x'_h$, the following relations hold:

$$y_h \oplus y'_h = (x_l \wedge m_4) \oplus (x'_l \wedge m_4) = (x_l \oplus x'_l) \wedge m_4$$

$$y_l \oplus y'_l = (x_l \wedge m_4) \oplus (x'_l \wedge m_4) \oplus x_l \oplus x'_l = (x_l \oplus x'_l) \wedge \overline{m_4}$$

Thus

$$\Delta_h \wedge \Delta_l = \left((x_l \oplus x'_l) \wedge m_4\right) \wedge \left((x_l \oplus x'_l) \wedge \overline{m_4}\right) = 0$$

2. $\Delta_h = (x_l \oplus x'_l) \wedge m_4$ so when $(m_4)_i = 0$, $(\Delta_h)_i = 0$ and when $(m_4)_i = 1$,
$(\Delta_h)_i = 0$ or $1$. The result follows.

3. The third point is symmetric to the second.
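The three properties are easy to confirm numerically. The following sketch (ours, with hypothetical helper names) draws random 64-bit values, applies the keyed swap to two inputs sharing the same left half, and checks each property on the resulting difference. The bitwise inequalities are tested as "no bit of the difference lies outside the allowed mask".

```python
import random

MASK64 = (1 << 64) - 1


def keyed_swap(x_h, x_l, k):
    """S_k: exchange between the halves the bit positions selected by k."""
    u = (x_h ^ x_l) & k
    return x_h ^ u, x_l ^ u


def swap_difference(x_h, x_l, x_l2, m4):
    """Difference of S_m4 on two inputs that share the left half x_h."""
    y_h, y_l = keyed_swap(x_h, x_l, m4)
    y_h2, y_l2 = keyed_swap(x_h, x_l2, m4)
    return y_h ^ y_h2, y_l ^ y_l2


def check_properties(trials=1000, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        x_h, x_l, x_l2, m4 = (rng.getrandbits(64) for _ in range(4))
        d_h, d_l = swap_difference(x_h, x_l, x_l2, m4)
        assert (d_h & d_l) == 0            # property 1
        assert (d_h & ~m4 & MASK64) == 0   # property 2: d_h only inside m4
        assert (d_l & m4) == 0             # property 3: d_l only outside m4
    return True


print(check_properties())
```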

Given the above properties, the attack consists in the following steps.

Step 1. First fix $x_h = a$ to a constant and try many $x_l = u_i$ values for
$i = 1, \ldots, n$. Request $y_i = \mathrm{UES}^{***}(a|u_i)$.

Step 2. For all $(i, j)$ pairs, let $y_i \oplus y_j = z|t$. If we have $z \wedge t = 0$, guess that we
have a collision as in the above attack. Guess that $m_4$ is an intermediate mask
between $z$ and $\bar{t}$. (If $z_i = 1$, then $(m_4)_i = 1$, and if $t_i = 1$, then $(m_4)_i = 0$.)
Thus the expected number of good events (the signal) is $n^2 \cdot 2^{-2w-1}$ and the
number of bad events (the noise) is $\frac{n^2}{2}\left(\frac{3}{4}\right)^{64} \approx n^2 \cdot 2^{-27.6}$. One good event
suggests 2/3 of the bits of $m_4$ on average. One bad event also suggests 2/3
of the bits of $m_4$ on average, but in a random way so that we can check
for consistency. It is thus likely that we recover the right mask $m_4$ within
four good events (mask bits suggested by bad events will happen to be
inconsistent with the good ones with high probability).

Step 3. From $m_4$ and the above collisions, apply the same attack as before to
distinguish UES*** from a random permutation.

Since we need four good events to occur, in the worst case ($w = 32$) we need $n \approx 2^{33.5}$,
and considering all $(i, j)$ pairs in Step 2 leads to a complexity of $2^{66}$. (These
are however very simple tests, so this complexity can actually be compared to an
exhaustive search for DES.) We can expect to get $\frac{n^2}{2}\left(\frac{3}{4}\right)^{64} \approx 2^{39.4}$ bad events
on average. Each event suggests a pattern for $m_4$ with determined bits (0 or 1),
and undetermined ones. A pair of events may thus be consistent with probability
$\left(\frac{7}{9}\right)^{64} \approx 2^{-23.2}$. We can thus expect to find $2^{77.8} \cdot 2^{-23.2} = 2^{54.6}$ consistent pairs
of bad events. More generally speaking, we are looking for multiple events in
which each pair is consistent with a unique mask $m_4$. This is the same problem as
seeking $k$-cliques in the consistency graph of the bad events. Any $k$-clique will be

consistent with probability there are exactly ^ such cliques. 

Therefore, when $k$ gets larger ($k > 11$), no $k$-clique in the consistency graph
will survive this filtering process. The complexity of this algorithm is subject to
combinatorial optimizations, and we believe that the bottleneck complexity will
actually come from the exhaustive search of DES keys.
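The noise estimates used in this section can be reproduced with a few logarithms. The sketch below is ours and assumes a simple model: for an unrelated pair of ciphertexts each bit position of $(z, t)$ is uniform, so $z \wedge t = 0$ holds per position with probability 3/4; each surviving position then suggests 0, 1, or nothing with equal probability, so two bad events contradict each other at a given position with probability 2/9.

```python
from math import log2


def log2_bad_event_prob(bits=64):
    """log2 of P[z AND t = 0] for a random 2*bits-bit difference."""
    return bits * log2(3 / 4)


def log2_pair_consistency(bits=64):
    """log2 of the probability that two bad events suggest compatible masks."""
    return bits * log2(7 / 9)


def log2_bad_events(log2_n, bits=64):
    """log2 of the expected bad events among the n^2/2 pairs of ciphertexts."""
    return 2 * log2_n - 1 + log2_bad_event_prob(bits)


events = log2_bad_events(33.5)
pairs_of_events = 2 * events - 1  # log2 of the number of unordered pairs
print(log2_bad_event_prob(), events, pairs_of_events + log2_pair_consistency())
```

The last printed value is the base-2 logarithm of the expected number of consistent pairs of bad events under this model.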

5.2 Keyed Swap of the Input 

Taking into account what we just saw, our final construction must make it as
hard as possible for the attacker to find the required $2^{34}$ different values entering
say the right half of the structure. Therefore all we have to do is make it hard to
find 34 bits entering the right half (else the attacker tries all the values of these
34 bits and keeps the rest constant, which leads to the above result).

Adding a final extra layer of keyed swapping on input to the structure will
lead to this result. The attacker now has to guess 34 bits that enter one half
in order to subsequently attack $m_4$ and then be able to distinguish UES from
a random permutation. The extra work factor is thus $\binom{128}{34} / \binom{64}{34}$, which is $2^{43}$.
Then the total complexity is = 2^®. 
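The size of this extra work factor can be checked directly. The sketch below assumes, as our reading of the argument, that the factor is the ratio between the number of ways to choose 34 bit positions among all 128 input bits and the number of ways to choose them within a single 64-bit half; the function name and its parameters are ours.

```python
from math import comb, log2


def log2_extra_work(total_bits=128, half_bits=64, guessed=34):
    """log2 of C(total_bits, guessed) / C(half_bits, guessed)."""
    return log2(comb(total_bits, guessed)) - log2(comb(half_bits, guessed))


print(round(log2_extra_work(), 2))
```

Under this assumption the ratio comes out slightly below $2^{43}$, in agreement with the order of magnitude stated in the text.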

We believe this security level is acceptable. 



5.3 Other Alternatives 

If the security level is still a concern to some people, one might also consider 
replacing the keyed outer swapping by a keyed permutation at the bit level. 
This will add a bit more complexity again. The attacker will have yet a harder 
time finding which 35 bits enter say the left part of the structure. However, bit 
permutations are very costly in terms of speed, therefore this alternative shall
only be considered if execution time is not much of an issue.

We also considered byte permutations, but these are far too trivial to attack. 
The overhead complexity is only about 2^^. 

Our final construction is therefore UES with keyed outer swapping and
whitening “à la DESX”.



6 Conclusion 

We have investigated several variants of a double-block length encryption scheme 
based on DES which is compliant with DES, 3DES as well as with AES spec- 
ifications. This may be useful for applications where DES or 3DES are still in 
use, but where people start to think about double block length and key sizes. 
Once the final AES is chosen and becomes a standard, it can be plugged into 
applications in place of our scheme with ease. Among several variants, we se- 
lected the best one in terms of security and simplicity, and showed that there is 
no practical attack that can endanger our scheme. Key recovery does not seem 
possible and in order to distinguish this new cipher from a random permutation, 
the workload is basically very high. 
Abstract. We describe the design of Yarrow, a family of cryptographic
pseudo-random number generators (PRNG). We describe the concept of 
a PRNG as a separate cryptographic primitive, and the design principles 
used to develop Yarrow. We then discuss the ways that PRNGs can fail 
in practice, which motivates our discussion of the components of Yarrow 
and how they make Yarrow secure. Next, we define a specific instance 
of a PRNG in the Yarrow family that makes use of available technology 
today. We conclude with a brief listing of open questions and intended 
improvements in future releases. 



1 Introduction 

Random numbers are critical in every aspect of cryptography. Cryptographers 
design algorithms such as RC4 and DSA, and protocols such as SET and SSL, 
with the assumption that random numbers are available. Even as straightforward 
an application as encrypting a file on a disk with a passphrase typically needs 
random numbers for the salt to be hashed in with the passphrase and for the 
initialization vector (IV) used in encrypting the file. To encrypt e-mail, digitally 
sign documents, or spend a few dollars worth of electronic cash over the internet, 
we need random numbers. 

Specifically, random numbers are used in cryptography in the following ap- 
plications: 

— Session and message keys for symmetric ciphers, such as triple-DES or Blow- 
fish. 

— Seeds for routines that generate mathematical values, such as large prime 
numbers for RSA or ElGamal-style cryptosystems. 

— Salts to combine with passwords, to frustrate offline password guessing pro- 
grams. 

— Initialization vectors for block cipher chaining modes. 

— Random values for specific instances of many digital signature schemes, such 
as DSA. 

— Random challenges in authentication protocols, such as Kerberos. 

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 13-^^2000. 

© Springer-Verlag Berlin Heidelberg 2000
— Nonces for protocols, to ensure that different runs of the same protocol are
unique; e.g., SET and SSL.

Some of those random numbers will be sent out in the clear, such as IVs and
random challenges. Others will be kept secret, and used
as keys for block ciphers. Some applications require a large quantity of random
numbers, such as a Kerberos server generating thousands of session keys every
hour, and others only a few. In some cases, an attacker can even force the random
generator to generate thousands of random numbers and send them to him.

Unfortunately, random numbers are very difficult to generate, especially on
computers that are designed to be deterministic. We thus fall back on
pseudo-random numbers.¹ These are numbers that are generated from some (hopefully
random) internal values, and that are very hard for an observer to distinguish
from random numbers.

Given the importance of generating pseudo-random numbers for cryptographic
applications, it is somewhat surprising that little formal cryptanalysis
of these generators exists. There are methodologies for generating randomness
on computer systems and ad hoc designs of generators,
but we are aware of only one paper cryptanalyzing these designs.



1.1 What Is a Cryptographic PRNG? 



In our context, a random number is a number that cannot be predicted by an
observer before it is generated. If the number is to be in the range $0 \ldots 2^n - 1$,
an observer cannot predict that number with probability any better than $1/2^n$.
If $m$ random numbers are generated in a row, an observer given any $m - 1$ of
them still cannot predict the $m$’th with any better probability than $1/2^n$. More
technical definitions are possible, but they amount to the same general idea.

A cryptographic pseudorandom number generator, or PRNG, is a crypto- 
graphic mechanism for processing somewhat-unpredictable inputs, and generat- 
ing pseudorandom outputs. If designed, implemented, and used properly, even 
an attacker with enormous computational resources should not be able to dis- 
tinguish a sequence of PRNG outputs from a random sequence of bits. 

There are a great many PRNGs in use in cryptographic applications. Some
of them (such as Peter Gutmann’s PRNG in Cryptlib or Colin Plumb’s
PRNG in PGP) are apparently pretty well designed. Others (such as the
RSAREF 2.0 PRNG, or the PRNG specified in ANSI X9.17) are appropriate for
some applications, but fail badly when used in other applications.



¹ It is important to distinguish between the meaning of pseudorandom numbers in
normal programming contexts, where these numbers merely need to be reasonably
random-looking, and in the context of cryptography, where these numbers must
be indistinguishable from real random numbers, even to observers with enormous
computational resources.

A PRNG can be visualized as a black box. Into one end flow all the internal
measurements (samples) which the system designer believed might be unpre- 
dictable to an attacker. Out of the other end, once the PRNG believes it is in 
an unguessable state, flow apparently random numbers. An attacker might con- 
ceivably have some knowledge or even control over some of the input samples to 
the PRNG. An attacker might have compromised the PRNG’s internal state at 
some point in the past. An attacker might have an extremely good model of the 
“unpredictable” values being used as input samples to the PRNG, and a great 
deal of computational power to throw at the problem of guessing the PRNG’s 
internal state. 

Internally, a PRNG needs to have a mechanism for processing those (hope- 
fully) unpredictable samples, a mechanism for using those samples to update 
its internal state, and a mechanism to use some part of its internal state to 
generate pseudorandom outputs. In some PRNG designs, more-or-less the same 
mechanism does all three of these tasks; in others, the mechanisms are clearly 
separated. 



1.2 Why Design a New PRNG? 

We designed Yarrow because we are not satisfied with existing PRNG designs. 
Many have flaws that allowed attacks under some circumstances (see 
for details on many of these). Most of the others do not seem to have been 
designed with attacks in mind. None implement all the defenses we have worked 
out over the last two years of research into PRNGs. 

Yarrow is an enhancement of a proprietary PRNG we designed several years 
ago for a client. We kept improving our design as we discovered new potential 
attacks. 



1.3 A Guide to the Rest of the Paper 

The remainder of this paper is organized as follows: In Section 2 we discuss
the reasons behind our design choices for Yarrow. In Section 3 we discuss the
various ways that cryptographic PRNGs can fail in practice. Then, in Section 4
we will discuss the basic components of Yarrow, and show how they resist the
kinds of failures listed earlier. Section 5 gives the generic design ideas and
their rationale. Finally, we will consider open questions relating to Yarrow,
and plans for future releases.

In the full paper we will define Yarrow-160, a precisely defined PRNG, and 
discuss entropy calculation. 

2 Yarrow Design Principles 

Our goal for Yarrow is to make a PRNG that system designers can fairly easily 
incorporate into their own systems, and that is better at resisting the attacks 
we know about than the existing, widely-used alternatives. 

We impose the following constraints on the design of Yarrow:

1. Everything is reasonably efficient. There is no point in designing a PRNG
that nobody will use, because it slows down the application too much.

2. Yarrow is so easy to use that an intelligent, careful programmer with no 
background in cryptography has some reasonable chance of using the PRNG 
in a secure way. 

3. Where possible, Yarrow re-uses existing building blocks.

Yarrow was created using an attack-oriented design process. This means we 
designed the PRNG with attacks in mind from the beginning. Block ciphers 
are routinely designed in this way, with structures intended to optimize their 
strength against commonly-used attacks such as differential and linear crypt- 
analysis. The Yarrow design was very much focused on potential attacks. This 
had to be tempered with other design constraints: performance, flexibility, sim- 
plicity, ease of use, portability, and even legal issues regarding the exportability 
of the PRNG were considered. The result is still a work-in-progress, but it resists 
every attack of which we are aware, while still being a usable tool for system 
designers. 

We spent the most time working on a good framework for entropy-estimation 
and reseeding, because this is so critical for the ultimate security of the PRNG, 
and because it is so often done badly in fielded systems. Our cryptographic 
mechanisms are nothing very exciting, just various imaginative uses of a hash 
function and a block cipher. However, they do resist known attacks very well. 



2.1 Terminology 

At any point in time, a PRNG contains an internal state that is used to generate 
the pseudorandom outputs. This state is kept secret and controls much of the 
processing. Analogous to ciphers we call this state the key of the PRNG. 

To update the key the PRNG needs to collect inputs that are truly random, 
or at least not known, predictable or controllable by the attacker. Often used 
examples include the exact timing of key strokes or the detailed movements of 
the mouse. Typically, there are a fairly large number of these inputs over time, 
and each of the input values is fairly small. We call these inputs the samples. 

In many systems there are several sources that each produce samples. We 
therefore classify the samples according to the source they came from. 

The process of combining the existing key and new sample(s) into a new key 
is called the reseeding. 

If a system is shut down and restarted, it is desirable to store some high- 
entropy data (such as the key) in non-volatile memory. This allows the PRNG 
to be restarted in an unguessable state at the next restart. We call this stored 
data the seed file. 



3 How Cryptographic PRNGs Fail 

In this section, we consider some of the ways that a PRNG can fail in a
real-world application. By considering how a PRNG can fail, we are able to
recognize ways to prevent these failures in Yarrow. In other cases, the
failures cannot be totally prevented, but we can make them less likely. In
still other cases, we can only ensure a quick recovery from the compromised
state.

3.1 How PRNGs Are Compromised 

Once the key of a PRNG is compromised, its outputs are predictable; at least 
until it gets enough new samples to derive a new, unguessable key. Many PRNGs 
have the property that, once compromised, they will never recover, or they will 
recover only after a very long time. 

For these reasons, it makes sense to consider how a PRNG’s key can be 
compromised, and how, once keys are compromised, they may be exploited. 



Entropy Overestimation and Guessable Starting Points. We believe that 
this is the most common failing in PRNGs in real-world applications. It is easy 
to look at a sequence of samples that appears random and has a total length 
of 128 bits, feed it into the PRNG, and then start generating output. If that 
sequence of samples turns out only to have 56 bits of entropy, then an attacker 
could feasibly perform an exhaustive search for the starting point of the PRNG. 

This is probably the hardest problem to solve in PRNG design. We tried to 
solve it by making sure that the entropy estimate is very conservative. While it is 
still possible to seriously overestimate the starting entropy, it is much less likely 
to happen, and when it does the estimate is likely to be closer to the actual 
value. We also use a computationally-expensive reseeding process to raise the 
cost of attempting to guess the PRNG’s key. 
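
To make the 128-bit versus 56-bit example above concrete, the following minimal sketch compares the worst-case cost of an exhaustive search over the starting point in both cases. The attacker speed of 10^9 guesses per second is purely an illustrative assumption and does not come from the paper.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
GUESSES_PER_SECOND = 1e9  # hypothetical attacker speed, for illustration only


def years_to_search(entropy_bits: int, rate: float = GUESSES_PER_SECOND) -> float:
    """Worst-case time to enumerate every starting point of the given entropy."""
    return 2 ** entropy_bits / rate / SECONDS_PER_YEAR


nominal = years_to_search(128)  # what the designer believed was collected
actual = years_to_search(56)    # what the samples really provided

print(f"128 bits: {nominal:.3g} years, 56 bits: {actual:.3g} years")
```

Under this assumed rate the 56-bit search finishes in a few years on a single machine, while the 128-bit search is far out of reach; the gap is the whole cost of the overestimate.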



Mishandling of Keys and Seed Files. Keys and seed files are easy to mis- 
handle in various ways, such as by letting them get written to the swap file by 
the operating system, or by opening a seed file, but failing to update it every 
time it is used. The Yarrow design provides some functions to simplify the man- 
agement of seed files. An excellent discussion of some methods for avoiding key 
compromise appears in 



Implementation Errors. Another way that the key of the PRNG can be
compromised is by exploiting some implementation error. Errors in the
implementation are impossible to prevent. The only preventative measure we
found for Yarrow was to try to make the interface reasonably simple so that
the programmer trying to use Yarrow in a real-world product can use it
securely without understanding much about how the PRNG works.

This is an area we are still working on. It is notoriously difficult to make 
security products easy to use for most programmers, and of course, it is very 
hard to be certain there are no errors in the Yarrow generator itself. 

One thing we can do is to make it easy to verify the correct implementation of
a Yarrow PRNG. We have carefully designed Yarrow to be portable and precisely
defined. This allows us to create test vectors that can be used to verify that a
Yarrow implementation is in fact working correctly. Without such test vectors
an implementor would never be able to ensure that her Yarrow implementation
was indeed working correctly.



Cryptanalytic Attacks on PRNG Generation Mechanisms. Between re- 
seedings, the PRNG output generation mechanism is basically a stream cipher. 
Like any other stream cipher, it is possible that the one used in a PRNG will 
have some cryptanalytic weakness that makes the output stream somewhat pre- 
dictable or at least recognizable. The process of finding weaknesses in this part 
of the PRNG is the same as finding them in a stream cipher. 

We have not seen a lot of PRNGs that were easily vulnerable to this kind of 
attack. Most PRNGs’ generation mechanisms are based on strong cryptographic 
mechanisms already. Thus, while this kind of attack is always a concern, it usually 
does not seem to break the PRNG. To be safe, we have designed Yarrow to be 
based on a block cipher; if the block cipher is secure, then so is the generation 
mechanism. This was done because there are quite a number of apparently-secure 
block ciphers available in the public domain. 



Side-Channel Attacks. Side-channel attacks are attacks that use additional
information about the inner workings of the implementation; timing
attacks and power analysis are typical examples. Many PRNGs that are
otherwise secure fall apart when any additional information about their
internal operations is leaked. One example of this is the RSAREF 2.0 PRNG,
which can be implemented in a way that is vulnerable to a timing attack.

It is probably not possible to protect against side-channel attacks in the 
design of algorithms. However, we do try to avoid obvious weaknesses, specifically 
any data-dependent execution paths. 



Chosen-Input Attacks on the PRNG. An attacker is not always limited to 
just observing PRNG outputs. It is sometimes possible to gain control over some 
of the samples sent into the PRNG, especially in a tamper-resistant token. Some 
PRNGs, such as the RSAREF 2.0 PRNG, are vulnerable to such attacks. In the 
worst case the attacker can mount an adaptive attack in which the samples are 
selected based on the output that the PRNG provides. To avoid this kind of 
attack in Yarrow, all samples are processed by a cryptographic hash function, 
and are combined with the existing key using a secure update function. 

3.2 How Compromises Are Exploited 

Once the key is compromised, it is interesting to consider how this compromise 
is exploited. Since it is not always possible to prevent an attacker from learning 
the key, it is reasonable to spend some serious time and effort making sure the 
PRNG can recover its security from a key compromise.

Permanent Compromise Attacks. Some PRNGs, such as the one proposed
in ANSI X9.17, have the property that once the key has been compromised, an 
attacker is forever after able to predict their outputs. This is a terrible property 
for a PRNG to have, and we have made sure that Yarrow can recover from a 
key compromise. 



Iterative Guessing Attacks. If the samples are mixed in with the key as they 
arrive, an attacker who knows the PRNG key can guess the next “unpredictable” 
sample, observe the next PRNG output, and test his guess by seeing if they agree. 
This means that a PRNG which mixes in samples with 32 bits of entropy every 
few output words will not recover from a key compromise until the attacker is 
unable to see the effects of three or four such samples on the outputs. This is 
called an iterative guessing attack, and the only way to resist it is to collect
entropy samples in a pool separate from the key, and only reseed the key when
the contents of the entropy pool are unguessable to any real-world attacker.
This is what Yarrow does.
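
The iterative guessing attack can be demonstrated on a toy generator. The sketch below is not Yarrow and not any PRNG named in this paper: `mix`, `output`, and `track` are hypothetical helpers, and each sample carries only 8 bits of entropy so that the search runs instantly. It shows an attacker who knows the key following the generator through five updates by brute-forcing one sample at a time.

```python
import hashlib
import secrets

SAMPLE_BITS = 8  # toy value; the text above uses 32 bits per sample


def mix(key: bytes, sample: int) -> bytes:
    """Toy update that folds one sample straight into the key."""
    return hashlib.sha1(key + sample.to_bytes(2, "big")).digest()


def output(key: bytes) -> bytes:
    """Toy output function derived from the current key."""
    return hashlib.sha1(b"out" + key).digest()[:8]


def track(known_key: bytes, observed: bytes) -> bytes:
    """Attacker: try every possible sample and keep the key that matches."""
    for guess in range(2 ** SAMPLE_BITS):
        candidate = mix(known_key, guess)
        if output(candidate) == observed:
            return candidate
    raise ValueError("no sample matched the observed output")


victim_key = secrets.token_bytes(20)
attacker_key = victim_key  # the key compromise
for _ in range(5):
    victim_key = mix(victim_key, secrets.randbelow(2 ** SAMPLE_BITS))
    attacker_key = track(attacker_key, output(victim_key))

still_compromised = attacker_key == victim_key
work_immediate = 5 * 2 ** SAMPLE_BITS   # guesses when samples are mixed in one at a time
work_pooled = 2 ** (5 * SAMPLE_BITS)    # guesses if all five samples arrive in one reseed
```

With immediate mixing the attacker's work grows linearly in the number of samples; if the same five samples are pooled and used in a single reseed, the attacker has to guess them jointly, which is the reason for the separate entropy pool.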



Backtracking Attacks. Some PRNGs, such as the RSAREF 2.0 PRNG, are 
easy to run backwards as well as forward. This means that an attacker that has 
compromised the PRNG’s key after a high-value RSA key pair was generated 
can still go back and learn that high-value key pair. We include a mechanism in 
Yarrow to limit backtracking attacks to a limited number of output bytes. 



Compromise of High-Value Keys Generated From Compromised Key. 

Of course, the biggest cost of a compromised PRNG is that it leads to
compromised system keys if the key generation process uses the PRNG. If the key
that is being generated is very valuable, the harm to the system owner can be 
very large. As we mentioned, the iterative guessing attacks require us to collect 
entropy in a pool before reseeding the generator with it. When we are about 
to generate a very valuable key, it is preferable to have whatever extra entropy 
there is in the PRNG’s key. Therefore, the user can request an explicit reseed of 
the generator. This feature is intended to be used rarely and only for generating 
high-value secrets. 

4 The Yarrow Design: Components 

In this section, we discuss the components of Yarrow, and how they interact. A 
major design principle of Yarrow is that its components are more-or-less inde- 
pendent, so that systems with various design constraints can still use the general 
Yarrow design. 

The use of algorithm-independent components in the top level design is a key
concept in Yarrow. Our goal is not to increase the number of security primitives
that a cryptographic system is based on, but to leverage existing primitives as
much as possible. Hence, we rely on one-way hash functions and block ciphers,
two of the best-studied and most widely available cryptographic primitives, in
our design.

Fig. 1. Generic block diagram of Yarrow

There are four major components:

1. An Entropy Accumulator which collects samples from entropy sources,
and accumulates them in the two pools.

2. A Reseed Mechanism which periodically reseeds the key with new entropy
from the pools.

3. A Generation Mechanism which generates PRNG outputs from the key.

4. A Reseed Control that determines when a reseed is to be performed.

Below, we specify each component’s role in the larger PRNG design, we
discuss the requirements for each component in terms of both security and
performance, and we discuss the way each component must interact with each
other component. Later in this paper, we will discuss specific choices for
these components.



4.1 Design Philosophy 

We have seen two basic design philosophies for PRNGs. 

One approach assumes that it is usually possible to collect and distill enough 
entropy from the samples that each of the output bits should have one bit of real 
entropy. If more output is required than entropy has been collected from the sam- 
ples, the PRNG either stops generating outputs or falls back on a cryptographic 
mechanism to generate the outputs. Colin Plumb’s PGP PRNG and Gutmann’s
Cryptlib PRNG both fall into this category. In this kind of design, entropy is
accumulated to be immediately reused as output, and the whole PRNG mecha- 
nism may be seen as a mechanism to distill and measure entropy from various 
sources on the machine, and a buffer to store this entropy until it is used. 

Yarrow takes a different approach. We assume that we can accumulate enough 
entropy to get the PRNG into an unguessable state (without such an assumption, 
there is no point designing a PRNG). Once at that starting point, we believe we 
have cryptographic mechanisms that will generate outputs an attacker cannot 
distinguish from random outputs. In our approach, the purpose of accumulating 
entropy is to be able to recover from PRNG key compromises. The PRNG is 
designed so that, once it has a secure key, even if all other entropy accumulated 
is predictable by, or even under the control of, an attacker, the PRNG is still
secure. This is also the approach taken by the RSAREF, DSA, and ANSI X9.17
PRNGs.

The strength of the first approach is that, if properly designed, it is possible 
to get unconditional security from the PRNG. That is, if the PRNG really does 
accumulate enough entropy to provide for all its outputs, even breaking some 
strong cipher like triple-DES will not be sufficient to let an attacker predict 
unknown PRNG outputs. The weakness of the approach is that the strength of 
the PRNG is based in a critical way on the mechanisms used to estimate and 
distill entropy. While this is inevitably true of all PRNGs, with a design like 
Yarrow we can afford to be far more conservative in our entropy estimates, since 
we are not expecting to be able to distill enough entropy to provide for all our 
outputs. In our opinion, entropy estimation is the hardest part of PRNG design. 
By contrast, the design of a generation mechanism that will resist cryptanalysis 
is a relatively easy task, making use of available cryptographic primitives such 
as a block cipher. 

Practical cryptographic systems rely on the strength of various algorithms, 
such as block ciphers, stream ciphers, hash functions, digital signature schemes, 
and public key ciphers. We feel that basing the strength of our PRNG on well- 
trusted cryptographic mechanisms is as reasonable as basing the strength of our 
systems on them. 

This approach raises two important issues, which should be made explicit: 

1. Yarrow’s outputs are cryptographically derived. Systems that use Yarrow’s 
outputs are no more secure than the generation mechanism used. Thus, un- 
conditional security is not available in systems like one-time pads, blind 
signature schemes, and threshold schemes. Those mechanisms are capable of 
unconditional security, but an attacker capable of breaking Yarrow’s
generation mechanism will be able to break a system that trusts Yarrow outputs
to be random. This is true even if Yarrow is accumulating far more entropy 
from the samples than it is producing as output. 

2. Like any other cryptographic primitive, a Yarrow generator has a limited 
strength which we express in the size of the key. Yarrow-160 relies on the
strength of three-key triple-DES and SHA-1, and has an effective key size
of about 160 bits. Systems that have switched to new cryptographic mech- 
anisms (such as the new AES cipher, when it is selected) in the interests 
of getting higher security should also use a different version of Yarrow to 
rely on those new mechanisms. If a longer key is necessary, then a future 
“larger” version of Yarrow should be used; it makes no sense to use a 160-bit 
PRNG to generate a 256-bit key for a block cipher, if 256 bits of security 
are actually required. 



4.2 Entropy Accumulator 

Entropy Accumulation. Entropy accumulation is the process by which a 
PRNG acquires a new, unguessable internal state. During initialization of the 
PRNG, and for reseeding during operation, it is critical that we successfully
accumulate entropy from the samples. To avoid iterative guessing attacks and
still regularly reseed the PRNG it is important that we correctly estimate the 
amount of entropy we have collected thus far. The entropy accumulation mech- 
anism must also resist chosen-input attacks, in the sense that it must not be 
possible for an attacker who controls some of the samples, but does not know 
others, to cause the PRNG to lose the entropy from the unknown samples. 

In Yarrow, entropy from the samples is collected into two pools, each a hashing 
context. The two pools are the fast pool and the slow pool; the fast pool provides 
frequent reseeds of the key, to ensure that key compromises have as short a 
duration as possible when our entropy estimates of each source are reasonably 
accurate. The slow pool provides rare, but extremely conservative, reseeds of the 
key. This is intended to ensure that even when our entropy estimates are very 
optimistic, we still eventually get a secure reseed. Alternating input samples are 
sent into the fast and slow pools. 

Each pool contains the running hash of all inputs fed into it since it was last
used to carry out a reseed.

In Yarrow-160, the pools are each SHA-1 contexts, and thus are 160 bits wide. 
Naturally, no more than 160 bits of entropy can be collected in these pools, and 
this determines the design strength of Yarrow-160 to be no greater than 160 bits. 
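
The two-pool arrangement described above can be sketched with two SHA-1 hashing contexts. This is only an illustration under stated assumptions: the class name `Accumulator`, its methods, and the absence of any sample framing are choices made for this sketch, not part of the Yarrow-160 definition (which is deferred to the full paper).

```python
import hashlib

FAST, SLOW = 0, 1


class Accumulator:
    """Two running SHA-1 hashes; samples from each source alternate between them."""

    def __init__(self) -> None:
        self.pools = [hashlib.sha1(), hashlib.sha1()]
        self.next_pool = {}  # source name -> pool index for its next sample

    def add_sample(self, source: str, sample: bytes) -> int:
        """Hash the sample into this source's next pool and return the pool used."""
        idx = self.next_pool.get(source, FAST)
        self.pools[idx].update(sample)
        self.next_pool[source] = SLOW if idx == FAST else FAST
        return idx

    def drain(self, idx: int) -> bytes:
        """Return the pool's 160-bit digest and restart its hashing context."""
        digest = self.pools[idx].digest()
        self.pools[idx] = hashlib.sha1()
        return digest


acc = Accumulator()
acc.add_sample("keyboard", b"\x12\x34")  # goes to the fast pool
acc.add_sample("keyboard", b"\x56\x78")  # goes to the slow pool
```

Because each pool is a hash context, its digest is never wider than 160 bits, which is the bound on collected entropy noted above.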

The following are the requirements for the entropy accumulation component: 

1. We must expect to accumulate nearly all entropy from the samples, up to 
the size of a pool, even when the entropy is distributed in various odd ways 
in those samples, e.g., always in the last bit, or no entropy in most samples, 
but occasional samples with nearly 100 bits of entropy in a 100-bit sample, 
etc. 

2. An attacker must not be able to choose samples to undo the effects of those 
samples he does not know on a pool. 

3. An attacker must not be able to force a pool into any kind of weak state, 
from which it cannot collect entropy successfully. 

4. An attacker who can choose which bits in which samples will be unknown
to him, but still has to allow n unknown bits, must not be able to narrow
down the number of states in a pool to substantially fewer than $2^n$.

Note that this last condition is a very strong requirement. This virtually 
requires the use of a cryptographic hash function. 



Entropy Estimation. Entropy estimation is the process of determining how 
much work it would take an attacker to guess the current contents of our pools. 
The general method of Yarrow is to group the samples into sources and estimate 
the entropy contribution of each source separately. To do this we estimate the 
entropy of each sample separately, and then add these estimates of all samples 
that came from the same source. 

The assumption behind this grouping into sources is that we do not want
our PRNG’s reseeding taking place based on only one source’s effects.
Otherwise, one source which appears to provide lots of entropy, but instead
provides relatively little, will keep causing the PRNG to reseed, and will
leave it vulnerable to an iterative guessing attack. We thus allow a single
fast source to cause frequent reseeding from the fast pool, but not the slow
pool. This ensures that we reseed frequently, but if our entropy estimates from
our best source are wildly inaccurate, we still will eventually reseed from the
slow pool, based on entropy estimates of a different source. Recall that
samples from each source alternate between the two pools.

Implementors should be careful in determining their sources. The sources 
should not be closely linked or exhibit any significant correlations. 

The entropy of each sample is measured in three ways: 

— The programmer supplies an estimate of entropy in a sample when he writes 
the routine to collect data from that source. Thus, the programmer might 
send in a sample, with an estimate of 20 bits of entropy. 

— For each source a specialized statistical estimator is used to estimate the 
entropy of the sample. This test is geared towards detecting abnormal situ- 
ations in which the samples have a very low entropy. 

— There is a system-wide maximum “density” of the sample, by considering 
the length of the sample in bits, and multiplying it by some constant factor 
less than one to get a maximum estimate of entropy in the sample. Currently, 
we use a multiplier of 0.5 in Yarrow-160. 

We use the smallest of these three estimates as the entropy of the sample in 
question. 
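
The rule of taking the smallest of the three estimates can be written down directly. In this sketch the function name and parameter names are illustrative assumptions; the statistical estimate is simply passed in, since the paper leaves the per-source estimator open. Only the 0.5 density multiplier is taken from the text.

```python
DENSITY = 0.5  # system-wide multiplier quoted above for Yarrow-160


def sample_entropy(sample: bytes,
                   programmer_estimate: float,
                   statistical_estimate: float) -> float:
    """Entropy (in bits) credited to one sample: the minimum of the three estimates."""
    density_cap = DENSITY * 8 * len(sample)  # at most half a bit per bit of sample
    return min(programmer_estimate, statistical_estimate, density_cap)


# A 4-byte sample claimed to hold 20 bits is capped at 16 bits by the density rule.
credited = sample_entropy(b"\x00\x01\x02\x03", programmer_estimate=20,
                          statistical_estimate=30)
```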

The specific statistical tests used depend on the nature of the source and
can be changed in different implementations. This is just another component, 
which can be swapped out and replaced by better-suited components in different 
environments. 



4.3 Generating Pseudorandom Outputs 

The Generation Mechanism provides the PRNG output. The output must have 
the property that, if an attacker does not know the PRNG’s key, he cannot 
distinguish the PRNG’s output from a truly random sequence of bits. 

The generation mechanism must have the following properties: 

— Resistant to cryptanalytic attack, 

— efficient, 

— resistant to backtracking after a key compromise, 

— capable of generating a very long sequence of outputs securely without re- 
seeding. 



4.4 Reseed Mechanism 

The Reseed Mechanism connects the entropy accumulator to the generating 
mechanism. When the reseed control determines that a reseed is required, the 
reseeding component must update the key used by the generating mechanism
with information from one or both of the pools being maintained by the entropy
accumulator, in such a way that if either the key or the pool(s) are unknown to 
the attacker before the reseed, the key will be unknown to the attacker after the 
reseed. It must also be possible to make reseeding computationally expensive to 
add difficulty to attacks based on guessing unknown input samples. 

Reseeding from the fast pool uses the current key and the hash of all inputs 
to the fast pool since the last reseed (or since startup) to generate a new key. 
After this is done, the entropy estimates for the fast pool are all reset to zero. 

Reseeding from the slow pool uses the current key, the hash of all inputs to 
the fast pool, and the hash of all inputs to the slow pool, to generate a new key. 
After this is done, the entropy estimates for both pools are reset to zero. 
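
A reseed with the properties listed above might look like the following sketch. The exact Yarrow-160 reseed construction is not given in this section, so everything here is an assumption made for illustration: the function name `reseed`, the iterated SHA-1 stretching loop, and the `work_factor` parameter. It only aims to show the two stated goals, namely combining the old key with the pool digest(s) and making each guess at the pool contents expensive.

```python
import hashlib
from typing import Sequence


def reseed(key: bytes, pool_digests: Sequence[bytes], work_factor: int = 1000) -> bytes:
    """Derive a new key from the current key and one or two pool digests."""
    # Stretch the pool contents so that testing one guess costs work_factor hashes.
    v = hashlib.sha1(b"".join(pool_digests)).digest()
    for i in range(work_factor):
        v = hashlib.sha1(v + i.to_bytes(4, "big")).digest()
    # Mixing in the old key means an unknown key OR unknown pools yields an unknown result.
    return hashlib.sha1(key + v).digest()


old_key = bytes(20)
fast = hashlib.sha1(b"fast pool inputs").digest()
slow = hashlib.sha1(b"slow pool inputs").digest()

key_after_fast = reseed(old_key, [fast])        # fast reseed: key + fast pool
key_after_slow = reseed(old_key, [fast, slow])  # slow reseed: key + both pools
```

After either call the caller would reset the corresponding entropy estimates to zero, as described above.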

4.5 Reseed Control 

The Reseed Control mechanism must weigh various considerations. Frequent 
reseeding is desirable, but it makes an iterative guessing attack more likely. 
Infrequent reseeding gives an attacker that has compromised the key more in- 
formation. The design of the reseed control mechanism is a compromise between 
these goals. 

We keep entropy estimates for each source as the samples have gone into
each pool. When any source in the fast pool has passed a threshold value, we
reseed from the fast pool. In many systems, we would expect this to happen
many times per hour. When any k of the n sources have hit a higher threshold
in the slow pool, we reseed from the slow pool. This is a much slower process.

For Yarrow-160, the threshold for the fast pool is 100 bits, and for the
slow pool, is 160 bits. At least two different sources must be over 160 bits in
the slow pool before the slow pool reseeds, by default. (This should be tunable
for different environments; environments with three good and reasonably fast
entropy sources should set k = 3.)
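
The reseed rule above can be sketched as a small bookkeeping class. The thresholds (100 and 160 bits) and the default k = 2 come from the text; the class and method names are assumptions for this sketch, as is the choice to treat a source as qualifying once its estimate reaches the threshold (the text says "passed" and "over", which could also be read as strictly greater).

```python
from collections import defaultdict
from typing import Optional

FAST_THRESHOLD = 100  # bits, any single source in the fast pool
SLOW_THRESHOLD = 160  # bits, per source in the slow pool
K = 2                 # number of sources that must reach SLOW_THRESHOLD


class ReseedControl:
    """Tracks per-source entropy estimates for each pool and decides when to reseed."""

    def __init__(self) -> None:
        self.fast = defaultdict(float)
        self.slow = defaultdict(float)

    def credit(self, pool: str, source: str, bits: float) -> None:
        """Add an entropy estimate for one sample that went into the given pool."""
        (self.fast if pool == "fast" else self.slow)[source] += bits

    def decision(self) -> Optional[str]:
        """Return 'slow', 'fast', or None, resetting estimates after a reseed."""
        if sum(1 for b in self.slow.values() if b >= SLOW_THRESHOLD) >= K:
            self.fast.clear()  # a slow reseed uses both pools, so both are reset
            self.slow.clear()
            return "slow"
        if any(b >= FAST_THRESHOLD for b in self.fast.values()):
            self.fast.clear()
            return "fast"
        return None
```

Note that a single source, however generous its estimates, can never trigger a slow reseed on its own in this sketch, which matches the intent of the k-of-n rule.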

5 The Generic Yarrow Design and Yarrow-160 

In this section, we describe the generic Yarrow design. This is a generic descrip- 
tion, using an arbitrary block cipher and hash function. If both algorithms are 
secure, and the PRNG gets sufficient starting entropy, our construction results 
in a strong PRNG. We also discuss the specific parameters and primitives used 
in Yarrow-160. 

We need two algorithms, with properties as follows: 

— A one-way hash function, h(x), with an m-bit output size,

— A block cipher, E(), with a k-bit key size and an n-bit block size.

The hash function is assumed to have the following properties: 

— Collision intractable.

— One-way.

— Given any set M of possible input values, the output values are distributed
as |M| selections of the uniform distribution over m-bit values.

The last requirement implies several things. Even if the attacker knows
most of the input to the hash function, he still has no effective knowledge about 
the output unless he can enumerate the set of possible inputs. It also makes 
it impossible to control any property of the output value unless you have full 
control over the input. 

The block cipher is assumed to have the following properties: 

— It is resistant to known-plaintext and chosen-plaintext attacks, even those 
requiring enormous numbers of plaintexts and their corresponding cipher- 
texts, 

— Good statistical properties of outputs, even given highly patterned inputs. 

The strength (in bits) of the resulting PRNG is limited by min(m, k). In
practice even this limit will not quite be reached. The reason is that if you take
an m bit random value and apply a hash function that produces m bits of output,
the result has less than m bits of entropy due to the collisions that occur. This
is a very minor effect, and overall results in the loss of at most a few bits of
entropy. We ignore this small constant factor, and say that the PRNG has a
strength of min(m, k) bits.
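
The size of the collision effect can be checked empirically on a scaled-down hash. The sketch below is an illustration only: it truncates SHA-1 to m = 16 bits (an assumption made so that all $2^{16}$ inputs can be enumerated) and measures how much Shannon entropy is lost when a uniformly random 16-bit value is hashed once. For an ideal random function one would expect roughly a fraction $1 - 1/e$ of the output values to be reached and somewhat under one bit of entropy to be lost.

```python
import hashlib
import math
from collections import Counter

M_BITS = 16  # toy output size; Yarrow-160 uses m = 160


def h(x: int) -> int:
    """SHA-1 truncated to M_BITS bits, standing in for an m-bit hash."""
    digest = hashlib.sha1(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big")


total = 2 ** M_BITS
counts = Counter(h(x) for x in range(total))  # how often each output value occurs

# Shannon entropy of h(X) when X is uniform over all M_BITS-bit inputs.
output_entropy = -sum(c / total * math.log2(c / total) for c in counts.values())
entropy_loss = M_BITS - output_entropy
fraction_reached = len(counts) / total

print(f"outputs reached: {fraction_reached:.3f}, entropy lost: {entropy_loss:.3f} bits")
```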

Yarrow-160 uses the SHA-1 hash function for h(), and three-key
triple-DES for $E_K()$.

5.1 Generation Mechanism 

Figure 2 shows the generator which is based on using the block cipher in counter
mode.

We have an n-bit counter value C. To generate the next n-bit output block,
we increment C and encrypt it with our block cipher, using the key K. To
generate the next output block we thus do the following:

$C \leftarrow (C + 1) \bmod 2^n$

$R \leftarrow E_K(C)$

where R is the next output block and K is the current PRNG key.

If the key is compromised at a certain point in time, the PRNG must not leak 
too many ‘old’ outputs that were generated before the compromise. It is clear 
that this generation mechanism has no inherent resistance to this kind of attack. 
For that reason, we keep count of how many blocks we have output. Once we
reach some limit $P_g$ (a system security parameter, $1 \le P_g \le 2^{n/3}$),
we generate k bits of PRNG output, and use them as the new key.

$K \leftarrow$ Next k bits of PRNG output

We call this operation a generator gate. Note that this is not a reseeding opera- 
tion as no new entropy is introduced into the key. 

In the interests of keeping an extremely conservative design, the maximum
number of outputs from the generator between reseedings is limited to
$\min(2^n, 2^{k/3} P_g)$ n-bit output blocks. The first term in the minimum
prevents the value C from cycling. The second term makes it extremely unlikely
that K will take on the same value twice. In practice, $P_g$ should be set much
lower than this, e.g. $P_g = 10$, in order to minimize the number of outputs
that can be learned by backtracking.

In Yarrow-160, we use three-key triple-DES in counter mode to 
generate outputs, and plan to apply the generator gate every ten 
outputs. (That is, $P_g = 10$.)
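The generation mechanism and the generator gate can be sketched in a few lines. This is only an illustration under stated assumptions: the Python standard library has no triple-DES, so a truncated HMAC-SHA256 stands in for E_K (it is a pseudorandom function, not a block cipher, so unlike real counter mode its outputs may collide). The names prf_block and Generator and the key size of 192 bits are hypothetical choices, not part of Yarrow-160.

```python
import hashlib
import hmac

BLOCK_BYTES = 8   # n = 64 bits, the triple-DES block size
KEY_BYTES = 24    # k = 192 bits; an illustrative choice, not taken from the paper


def prf_block(key: bytes, counter: int) -> bytes:
    """Stand-in for E_K(C): HMAC-SHA256 truncated to one block (NOT triple-DES)."""
    msg = counter.to_bytes(BLOCK_BYTES, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[:BLOCK_BYTES]


class Generator:
    def __init__(self, key: bytes, counter: int = 0, gate_limit: int = 10):
        self.key = key
        self.counter = counter
        self.gate_limit = gate_limit      # P_g
        self.blocks_since_gate = 0

    def _next_block(self) -> bytes:
        # C <- (C + 1) mod 2^n ; R <- E_K(C)
        self.counter = (self.counter + 1) % (1 << (8 * BLOCK_BYTES))
        return prf_block(self.key, self.counter)

    def _gate(self) -> None:
        # Generator gate: K <- next k bits of PRNG output (no new entropy).
        material = b""
        while len(material) < KEY_BYTES:
            material += self._next_block()
        self.key = material[:KEY_BYTES]
        self.blocks_since_gate = 0

    def output_block(self) -> bytes:
        block = self._next_block()
        self.blocks_since_gate += 1
        if self.blocks_since_gate >= self.gate_limit:
            self._gate()
        return block
```

Applying the gate immediately after the last permitted output, rather than lazily before the next one, means the old key does not linger in memory between requests.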



Security Arguments. 

Normal Operations. Consider an attacker who can, after seeing a long sequence
of outputs from this generator under the same key K, extract the key. This can 
be converted into a chosen plaintext attack on the cipher to extract its key. 

Consider an attacker who can, after seeing a long sequence of outputs from
this generator under the same key, predict a single future or past output value.
The algorithm used by the attacker performs a chosen-plaintext attack on the
underlying block cipher, allowing the prediction of (part of) one ciphertext after
some number of encryptions of chosen plaintexts have been seen. This is enough
of a demonstrated weakness to rule the cipher out for many uses, e.g. in
CBC-MAC.

Backtracking Protection. Consider an attacker who can use the outputs after a
generator gate has taken place to mount an attack on the data generated before 
the generator gate. The same attacker can mount his attack on the generator 
without the generator gate by using k known bits of the generator output to form 
a new key, using that key to generate a sequence of outputs, and then applying 
the attack. (This is possible as the counter value C is assumed to be known to 
the attacker.) Thus, a generator gate cannot expose previous output values to 
attack without also demonstrating a weakness in the generation mechanism in 
general. 

Consider an attacker who somehow compromises the current key of the PRNG.
Suppose he can learn a previous key from the current key. To do this,
he must be able to extract the key of the block cipher given a small number 
of bits of the generator’s output. Thus, the attacker must defeat the generator 
mechanism to defeat the generator gate mechanism. 

Consider an attacker who can predict the next key generated by the generator 
gate. The same method he uses to do this can be used to predict the next PRNG 
output, if the generator is used without generator gate. 



Limits on Generator Outputs. As the number of output blocks from the
basic generator available to the attacker grows closer to and beyond $2^{n/2}$, it
becomes easier and easier to distinguish the cipher’s outputs from a real random 
sequence. A random sequence should have collisions in some n-bit output blocks, 
but there will be no repetitions of output blocks in the output from running a 
block cipher in counter mode. This means that a conservative design should re- 
key long before this happens. This is the reason why we require the generator gate 
to be used at least once every $2^{n/3}$ output blocks. Note that $P_g$ is a configurable
parameter and can be set to smaller values. Smaller values of Pg increase the 
number of generator gates and thus decrease the amount of old data an attacker 
can retrieve if he were to find the current key. The disadvantage of very small 
Pg values is that performance suffers, especially if a block cipher is used that 
has an expensive key schedule. 

Each time we use the generator gate, we generate a new key from the old 
key using a function that we can assume to behave as a random function. This 
function is not the same function for each generator gate, as the counter C 
changes in value. There are therefore no direct cycles for K to fall into. Any cycle 
would require C to wrap around, which we do not allow between reseedings. To 
be on the safe side, we restrict the number of generator gate operations to
$2^{k/3}$, which makes it extremely unlikely that the same value K will be used twice
between reseedings.



Implementation Ideas. The use of counter mode allows several output blocks 
to be computed together, or even in parallel. A hardware implementation can 
exploit this parallelism using a pipelined design, and software implementations 
could use a bit-sliced implementation of the block cipher for higher performance. 
Even for simple software implementations it might very well be more efficient 
to produce many blocks at a time and to buffer the output in a secure memory 
area. This improves the locality of the code, and can improve the cache-hit ratio 
of the program. 



5.2 Entropy Accumulator 

To accumulate the entropy from a sequence of inputs, we concatenate all the 
inputs. Once we have collected enough entropy we apply the hash function h to 
the concatenation of all inputs. We feed the samples from each source
alternately into each pool.

In Yarrow-160, we use the SHA-1 hash function to accumulate inputs
in this way. We alternate feeding inputs from each source into
the fast and slow pools; each pool is its own SHA-1 hash context, and
thus effectively contains the SHA-1 hash of all inputs fed into that
pool.

Security Arguments. If we believe that an attacker cannot find collisions in 
the hash function, then we must also believe that an attacker cannot be helped 
by any collisions that exist. 

Consider the situation of an attacker trying to predict the whole sequence 
of inputs to be fed into the user’s entropy accumulator. The attacker’s best 
strategy is to try to generate a list of the most likely input sequences, in order of 
decreasing probability. If he can generate a list that is feasible for him to search 
through which has a reasonable probability (say, a $10^{-6}$ chance) of containing the
actual sequence of samples, he has a worthwhile attack. Ultimately, an attacker 
in this position cannot be resisted effectively by the design of the algorithm, 
though we do our best. He can only be resisted by the use of better entropy 
sources, and by better estimation of the entropy in the pool. 

Now, how can the entropy accumulator help the attacker? Only by reducing 
the total number of different input sequences he must test. However, in order 
for the attacker to see a single pair of different input sequences that will lead to 
the same entropy pool contents he must find a pair of distinct input sequences 
that have the same hash value. 

Implementation Ideas. All common hash functions can be computed in an 
incremental manner. The input string is usually partitioned into fixed size blocks, 
and these blocks are processed sequentially by the hash function. This allows 
an implementation to compute the hash of the sequence of inputs on the fly. 
Instead of concatenating all inputs and applying the hash function in one go 
(which would require an unbounded amount of memory) the software can use a 
fixed size buffer and compute the hash partially whenever the buffer is full. 

As with the generator mechanism, the locality of the code can be improved by 
using a buffer that is larger than one hash function input block. The entropy ac- 
cumulator would thus accumulate several blocks worth of samples before hashing 
the entire buffer. 

The entropy accumulator should be careful not to generate any overflows 
while adding up the entropy estimates. As there is no limit on the number of 
samples the accumulator might have to process between two reseeds the imple- 
mentation has to handle this case. 
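The incremental hashing and the per-source alternation between pools described above can be sketched as follows. This is a minimal illustration, not the Yarrow-160 reference code: the class name EntropyAccumulator, its method names and the source labels are hypothetical. Python integers do not overflow, so the overflow concern for the entropy counters does not arise here; a fixed-width implementation would need saturating addition instead.

```python
import hashlib


class EntropyAccumulator:
    def __init__(self):
        # Each pool is its own running SHA-1 context.
        self.pools = {"fast": hashlib.sha1(), "slow": hashlib.sha1()}
        # Entropy estimates are kept per pool and per source.
        self.estimates = {"fast": {}, "slow": {}}
        self._next_pool = {}   # which pool each source feeds next

    def add_sample(self, source: str, sample: bytes, entropy_bits: int) -> str:
        pool = self._next_pool.get(source, "fast")
        self.pools[pool].update(sample)          # hash computed on the fly
        est = self.estimates[pool]
        est[source] = est.get(source, 0) + entropy_bits
        self._next_pool[source] = "slow" if pool == "fast" else "fast"
        return pool

    def pool_hash(self, pool: str) -> bytes:
        # copy() lets us read the digest without disturbing the running context.
        return self.pools[pool].copy().digest()
```

Because the hash context is updated sample by sample, the result equals the hash of the concatenation of all samples fed into that pool, with no need to store them.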

5.3 Reseed Mechanism 

The reseeding mechanism generates a new key K for the generator from the 
entropy accumulator’s pool and the existing key. The execution time of the 
reseed mechanism depends on a parameter $P_t \ge 0$. This parameter can either
be fixed for the implementation or be dynamically adjusted.

The reseed process consists of the following steps:

1. The entropy accumulator computes the hash on the concatenation of all the
   inputs into the fast pool. We call the result $v_0$.

2. Set $v_i := h(v_{i-1} \mid v_0 \mid i)$ for $i = 1, \ldots, P_t$.

3. Set $K \leftarrow h'(h(v_{P_t} \mid K), k)$.

4. Set $C \leftarrow E_K(0)$.

5. Reset all entropy estimate accumulators of the entropy accumulator to zero.

6. Wipe the memory of all intermediate values.

7. If a seed file is in use, the next 2k bits of output from the generator are
   written to the seed file, overwriting any old values.

Step 1 gathers the output from the entropy accumulator. Step 2 uses an
iterative formula of length $P_t$ to make the reseeding computationally expensive
if desired. Step 3 uses the hash function h and a function h', which we will define
shortly, to create a new key K from the existing key and the new entropy value
$v_{P_t}$. Step 4 defines the new value of the counter C.

The function h' is defined in terms of h. To compute h'(m, k) we construct

    $s_0 := m$
    $s_i := h(s_0 \mid \ldots \mid s_{i-1}), \quad i = 1, \ldots$
    $h'(m, k) :=$ first $k$ bits of $(s_0 \mid s_1 \mid \ldots)$

This is effectively a ‘size adaptor’ function that converts an input of any 
length to an output of the specified length. If the input is larger than the desired 
output, the function takes the leading bits of the input. If the input is the same 
size as the output the function is the identity function. If the input is smaller 
than the output the extra bits are generated using the hash function. This is a 
very expensive type of PRNG, but for the small sizes we are using this is not a 
problem. 
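The size adaptor h' can be sketched directly from its definition. The sketch below assumes SHA-1 for h and, for simplicity, only handles output lengths that are a multiple of 8 bits; the function name size_adaptor is a hypothetical choice.

```python
import hashlib


def size_adaptor(m: bytes, k_bits: int) -> bytes:
    """h'(m, k): first k bits of s_0 | s_1 | ..., with s_0 = m and
    s_i = h(s_0 | ... | s_{i-1})."""
    if k_bits % 8 != 0:
        raise ValueError("this sketch only supports whole bytes")
    k_bytes = k_bits // 8
    stream = m                                    # s_0
    while len(stream) < k_bytes:
        # stream currently holds s_0 | ... | s_{i-1}; append s_i.
        stream += hashlib.sha1(stream).digest()
    return stream[:k_bytes]
```

The three cases in the text fall out of the loop: a long input is truncated, an input of exactly k bits is returned unchanged, and a short input is extended with hash output.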

There is no security reason why we would set a new value for the counter C. 
This is done to allow more implementation flexibility and still maintain compat- 
ibility between different implementations. Setting the counter C makes it simple 
for an implementation to generate a whole buffer of output from the generator 
at once. If a reseed occurs, the new output should be derived from the new seed 
and not from the old output buffer. Setting a new C value makes this simple: 
any data in the output buffer is simply discarded. Simply re-using the existing 
counter value is not compatible as different implementations have different sizes 
of output buffers, and thus the counter has been advanced to different points. 
Rewinding the counter to the virtual ‘current’ position is error-prone. 

To reseed the slow pool, we feed the hash of the slow pool into the fast pool, 
and then do a reseed. In general, this slow reseed should have Pt set as high as 
is tolerable. 

In Yarrow-160, this is done as described above, but using SHA-1
and triple-DES. We generate a three-key triple-DES key from the 
hash of the contents of the pool or pools used, and the current key. 
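Steps 1 to 4 of the reseed can be sketched as below. Several details are assumptions made only for illustration: the iteration index i is encoded as a 4-byte big-endian integer, the key length is 24 bytes, and a truncated HMAC-SHA256 stands in for E_K because the standard library has no triple-DES. The function names are hypothetical.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def h_prime(m: bytes, k_bytes: int) -> bytes:
    # Size adaptor: extend m with hash output, then truncate.
    s = m
    while len(s) < k_bytes:
        s += h(s)
    return s[:k_bytes]


def stand_in_encrypt(key: bytes, block: int) -> int:
    # Placeholder for E_K(block); NOT a real block cipher.
    tag = hmac.new(key, block.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(tag[:8], "big")


def reseed(fast_pool_hash: bytes, old_key: bytes, p_t: int, k_bytes: int = 24):
    v0 = fast_pool_hash                            # step 1
    v = v0
    for i in range(1, p_t + 1):                    # step 2
        v = h(v + v0 + i.to_bytes(4, "big"))
    new_key = h_prime(h(v + old_key), k_bytes)     # step 3
    new_counter = stand_in_encrypt(new_key, 0)     # step 4
    return new_key, new_counter
```

Steps 5 to 7 (resetting estimates, wiping intermediates, writing the seed file) are bookkeeping and are left out of the sketch.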

Security Arguments. Consider an attacker who starts out knowing the generator
key but not the contents of the entropy pool hash $v_0$. The value $v_{P_t}$ is
a pure function of $v_0$, so the attacker has no real information about $v_{P_t}$. This
value is then hashed with K, and the result is size-adjusted to be the new key.
As the result of the hash has as much entropy as $v_{P_t}$ has, the attacker loses his
knowledge about K.

Consider an attacker in the opposite situation: he starts out knowing the 
samples that have been processed, but not the current generator key. The attacker
thus knows $v_{P_t}$. However, an attacker with no knowledge of the key K
cannot predict the result of the hash, and thus ends up knowing nothing about 
the new key. 

5.4 Reseed Control 

The reseed control module determines when a reseed is to be performed. An ex- 
plicit reseed occurs when some application explicitly asks for a reseed operation. 
This is intended to be used only rarely, and only by applications that generate 
very high-valued random secrets. Access to the explicit reseed function should 
be restricted in many cases. 

Reseeds also occur automatically. The fast pool is used to reseed
whenever any of its sources has an entropy estimate over some threshold
value. The slow pool is used to reseed whenever at least two of its sources have
entropy estimates above some other threshold value.

In Yarrow-160, the fast pool threshold is 100 bits, and the slow
pool threshold is 160 bits. Two sources must pass the threshold for
the slow pool to reseed.
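The automatic reseed rules amount to two threshold checks over the per-source entropy estimates. A minimal sketch, using the Yarrow-160 thresholds quoted above; the function names and the dictionary layout (source name to estimated bits) are hypothetical.

```python
FAST_THRESHOLD_BITS = 100
SLOW_THRESHOLD_BITS = 160
SLOW_SOURCES_REQUIRED = 2


def fast_reseed_due(estimates: dict) -> bool:
    """True if any single source in the fast pool exceeds its threshold."""
    return any(bits > FAST_THRESHOLD_BITS for bits in estimates.values())


def slow_reseed_due(estimates: dict) -> bool:
    """True if at least two sources in the slow pool exceed the threshold."""
    passing = sum(1 for bits in estimates.values() if bits > SLOW_THRESHOLD_BITS)
    return passing >= SLOW_SOURCES_REQUIRED
```

Requiring two sources for the slow pool means that one badly overestimated source cannot trigger a slow reseed on its own.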

6 Open Questions and Plans for the Future 

Yarrow-160, our current construction, is limited to at most 160 bits of security 
by the size of its entropy accumulation pools. Three-key triple-DES has known 
attacks considerably better than brute-force; however, the backtracking preven- 
tion mechanism changes keys often enough that the cipher still has about 160 
bits of security in practice. 

At some point in the future, we expect to see a new block cipher standard, 
the AES. Yarrow’s basic design can easily accommodate a new block cipher. 
However, we will also have to either change hash functions, or come up with some 
special hash function construction to provide more than 160 bits of entropy pool. 
For AES with 128 bits, this will not be an issue; for AES with 192 bits or 256 
bits, it will have to be dealt with. We note that the generic Yarrow framework 
will accommodate the AES block cipher and a 256-bit hash function (perhaps
constructed from the AES block cipher) with no problems. 

In practice, we expect any weaknesses in Yarrow-160 to come from poorly
estimating entropy, not from cryptanalysis. For that reason, we hope to continue 
to improve the Yarrow entropy estimation mechanisms. This is the subject of 
ongoing research; as better estimation tools become available, we will upgrade
Yarrow to use them. 

We still have to create a reference implementation of Yarrow-160, and create
test vectors for various parameter sets. These test vectors will test all aspects
of the generator. This will probably require the use of Yarrow versions with
different parameters than the ones used in Yarrow-160; the details of this remain
to be investigated.

The reseed control rules are still an ad-hoc design. Further study might yield 
an improved set of reseed control rules. This is the subject of ongoing research.

7 On the Name “Yarrow” 

Yarrow is a flowering perennial with distinctive flat flower heads and lacy leaves, 
like Queen Anne’s Lace or wild carrot. Yarrow stalks have been used for div- 
ination in China since the Hsia dynasty, in the second millennium B.C.E. The 
fortuneteller would divide a set of 50 stalks into piles, then repeatedly use modulo 
arithmetic to generate two bits of random information (but with a nonuniform 
distribution) . 

The full method is described below. The most notable things are: one,
it takes an amazing amount of effort to generate two random bits; and two, it
does not produce a flat output distribution, but, apparently, 1/16, 3/16, 5/16
and 7/16.

The oracle is consulted with the help of yarrow stalks. These stalks are short 
lengths of bamboo, about four inches in length and an eighth inch in diameter. 
Fifty stalks are used for this purpose. One is put aside and plays no further part. 
The remaining 49 stalks are first divided into two random heaps. One stalk is 
then taken from the right-hand heap and put between the ring finger and the 
little finger of left hand. Then the left-hand heap is placed in the left hand, 
and the right hand takes from it bundles of 4, until there are 4 or fewer stalks 
remaining. This remainder is placed between the ring finger and the middle 
finger of the left hand. Next the right-hand heap is counted off by fours, and 
the remainder is placed between the middle finger and the forefinger of the left 
hand. The sum of the stalks now between the fingers of the left hand is either 
9 or 5. (The various possibilities are 1 + 4 + 4, or 1 + 3 + 1, or 1 + 2 + 2,
or 1 + 1 + 3; it follows that the number 5 is easier to obtain than the number
9.) At this first counting off of the stalks, the first stalk — held between the little 
finger and the ring finger — is disregarded as supernumerary, hence one reckons 
as follows: 9 = 8, or 5 = 4. The number 4 is regarded as a complete unit, to 
which the numerical value 3 is assigned. The number 8, on the other hand, is 
regarded as a double unit and is reckoned as having only the numerical value 
2. Therefore, if at the first count 9 stalks are left over, they count as 2; if 5 are 
left, they count as 3. These stalks are now laid aside for the time being. 

Then the remaining stalks are gathered together again and divided anew. 
Once more one takes a stalk from the pile on the right and places it between the 
ring finger and the little finger of the left hand; then one counts off the stalks 
as before. This time the sum of the remainders is either 8 or 4, the possible
combinations being 1 + 4 + 3, or 1 + 3 + 4, or 1 + 1 + 2, or 1 + 2 + 1, so 
that this time the chances of obtaining 8 or 4 are equal. The 8 counts as a 2, the 
4 counts as a 3. The procedure is carried out a third time with the remaining 
stalks, and again the sum of the remainders is 8 or 4. 

Now from the numerical values assigned to each of the three composite re- 
mainders, a line is formed with a total value of 6, 7, 8, or 9. 
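The quoted 1/16, 3/16, 5/16, 7/16 distribution can be reproduced with a short calculation. The sketch assumes an idealized model that the description only implies: in the first count the four listed possibilities are equally likely (so the value 2 has probability 1/4 and the value 3 has probability 3/4), and in the second and third counts the values 2 and 3 are equally likely.

```python
from fractions import Fraction
from itertools import product

# Value assigned at each count and its probability under the idealized model.
first = {2: Fraction(1, 4), 3: Fraction(3, 4)}   # 9 stalks -> 2, 5 stalks -> 3
later = {2: Fraction(1, 2), 3: Fraction(1, 2)}   # 8 stalks -> 2, 4 stalks -> 3

line_distribution = {}
for a, b, c in product(first, later, later):
    total = a + b + c                             # line value: 6, 7, 8 or 9
    prob = first[a] * later[b] * later[c]
    line_distribution[total] = line_distribution.get(total, 0) + prob

for value in sorted(line_distribution):
    print(value, line_distribution[value])
```

Under this model the four line values 6, 9, 7 and 8 receive probabilities 1/16, 3/16, 5/16 and 7/16 respectively, which is roughly 1.75 bits of entropy per line rather than 2.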

Yarrow stalks are still used for fortunetelling in China, but with a greatly 
simplified method: shake a container of 100 numbered yarrow stalks until one 
comes out. This random number is used as an index into a table of fortunes. 

Abstract. In this paper, we introduce a new approach to the generation
of binary sequences by applying trace functions to elliptic curves
over $GF(2^n)$. We call these sequences elliptic curve pseudorandom sequences
(EC-sequences). We determine their periods, distribution of zeros
and ones, and linear spans for a class of EC-sequences generated from 
supersingular curves. We exhibit a class of EC-sequences which has half 
period as a lower bound for their linear spans. EC-sequences can be 
constructed algebraically and can be generated efficiently in software 
or hardware by the same methods that are used for implementation of 
elliptic curve public-key cryptosystems. 



1 Introduction 

It is a well-known result that any periodic binary sequence can be decomposed as
a sum of linear feedback shift register (LFSR) sequences and can be considered as
a sequence arising from operating a trace function on a Reed-Solomon codeword.
More precisely, let $\alpha$ be a primitive element of a finite field $F_{2^n}$ and let
$C = \{r_1, \ldots, r_s\}$, $0 < r_i < 2^n - 1$, be the null spectrum set of a Reed-Solomon
code. If we want to transmit a message $m = (m_1, \ldots, m_s)$, $m_i \in F_{2^n}$, over a
noisy channel, then first we form a polynomial $g(x) = \sum_{i=1}^{s} m_i x^{r_i}$ and then
compute $c_j = g(\alpha^j)$. The codeword is $c = (c_0, c_1, \ldots, c_{2^n-2})$. Now we apply the
trace function from $F_{2^n}$ to $F_2$ to this codeword, i.e., we compute

    $a_i = Tr(c_i) = Tr(g(\alpha^i)), \quad i = 0, 1, \ldots, 2^n - 2.$    (1)

Then the resulting sequence $A = \{a_i\}$ is a binary sequence whose period
is a factor of $2^n - 1$. All periodic binary sequences can be reduced to this model.
Note that if $g(x) = x$, then A is an m-sequence of period $2^n - 1$. A lot of research

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 34-^^2000. 
has been done concerning ways to choose the function g(x) such that the resulting
sequence has good statistical properties. Examples include filter function
generators, combinatorial function generators, clock controlled generators and
shrinking generators. Unfortunately, the
trace function destroys the structure of the Reed-Solomon code. It is difficult to
get sequences satisfying cryptographic requirements from this approach. If one
can specify the linear span, then there is no obvious method to determine the
statistical properties of the resulting sequences. Examples include many conjectured
sequences with two-level autocorrelation or lower level cross correlation.
If one can fix the parameters for good statistical properties, then all
known sequences have low linear spans in the sense that the ratio of linear span to
the period is much less than 1/2.

Note that if a binary sequence of period $2^n$ has the property that each n-tuple
occurs exactly once in one period, then it is called a de Bruijn sequence.
Chan et al. proved that de Bruijn sequences have large linear spans. From a
de Bruijn sequence of period $2^n$ one can construct a binary sequence of period
$2^n - 1$ by deleting one zero from the unique run of zeros of length n. The resulting
sequence is called a modified de Bruijn sequence. There is no theoretical
result on the linear spans of such sequences except for m-sequences. Experimental
computations of the linear spans of the modified sequences have only been done
for sequences with period 15, 31 and 63. Another problem with de
Bruijn sequences is that they are difficult to implement. All algorithms
for constructing de Bruijn sequences (except for a class constructed from the
m-sequences of period $2^n - 1$) require a huge memory space. It is infeasible to
construct a de Bruijn sequence or a nonlinear modified de Bruijn sequence with
period $2^n$ when n > 30. (It is a well-known fact that in the design of
secure systems, if one sequence can be obtained by removing or inserting one bit
from another sequence, and the resulting sequence has a large linear span, then
it is not considered secure. Consequently, the de Bruijn sequences of period
$2^n$ constructed from m-sequences of period $2^n - 1$ by inserting one zero into the
run of zeros of length n - 1 of the m-sequence are not considered to be good
pseudorandom sequences.)

In this paper, we introduce a new method for generating binary sequences.
We will replace a Reed-Solomon codeword in (1) by the points on an elliptic
curve over $F_{2^n}$. The resulting binary sequences are called elliptic curve
pseudorandom sequences, or EC-sequences for short. We will discuss constructions and
representation of EC-sequences, their statistical properties, their periods and
linear spans. We exhibit a class of EC-sequences which may be suitable for use
as a key generator in stream cipher cryptosystems. These EC-sequences have
period equal to $2^{n+1}$, the bias for unbalance is $\lfloor 2^{n/2} \rfloor$, and the lower and
upper bounds on their linear spans are $2^n$ and $2^{n+1} - 2$, respectively. It is worth
pointing out that EC-sequences can be constructed algebraically and they can
be generated efficiently in software or hardware by the same methods that are
used for implementation of elliptic curve public-key cryptosystems.

The paper is organized as follows. In Section 2, we introduce some concepts
and preliminary results from sequence analysis and the definition of the
elliptic curves over $F_{2^n}$. In Section 3, we give a method for construction of
EC-sequences and their representation by interleaved structure. In Section 4, we
discuss statistical properties of EC-sequences constructed from supersingular
elliptic curves. In Section 5, we determine the periods of EC-sequences constructed
from supersingular elliptic curves. In Section 6, we derive a lower bound and an
upper bound for EC-sequences constructed from a class of supersingular elliptic
curves with order $2^n + 1$. Section 7 shows a class of EC-sequences which are
suitable for use as a key generator in stream cipher cryptosystems. A comparison
of this class of EC-sequence generators with the other known pseudo-random
sequence generators is also included in this section.

Remark. Kaliski discussed how to generate a pseudo-random sequence from
elliptic curves, where he used randomness criteria based on the computational
difficulty of the discrete logarithm over the elliptic curves. In
this paper our approach is completely different. We use the unconditional ran- 
domness criteria to measure the EC-sequences and use the trace function to 
obtain binary sequences. A set of the unconditional randomness measurements 
for pseudorandom sequence generators is described as follows: 

- Long period

- Balance property (Golomb Postulate 1)

- Run property (Golomb Postulate 2)

- n-tuple distribution

- Two-level auto correlation (Golomb Postulate 3)

- Low-level cross correlation

- Large linear span and smoothly increasing linear span profiles

2 Preliminaries 

In this section, we introduce some concepts and preliminary results on sequence 
analysis. 

Let $q = 2^n$, let $F_q$ be a finite field and let $F_q[x]$ be the ring of polynomials
over $F_q$.

2.1 Trace Function from $F_q$ to $F_2$

    $Tr(x) = x + x^2 + x^{2^2} + \cdots + x^{2^{n-1}}, \quad x \in F_q.$

Property: $Tr(x^{2^k}) = Tr(x)$ for any positive integer k.
Any $x \in F_q$ can be written as

    $x = x_0 \alpha + x_1 \alpha^2 + \cdots + x_{n-1} \alpha^{2^{n-1}}, \quad x_i \in \{0, 1\}$

where $\{\alpha, \alpha^2, \ldots, \alpha^{2^{n-1}}\}$ is a normal basis of $F_{2^n}$. In this representation, Tr(x)
can be computed as follows:

    $Tr(x) = x_0 + x_1 + \cdots + x_{n-1}.$
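The trace function is easy to experiment with on a small field. The sketch below uses a polynomial basis rather than the normal basis mentioned above, so it evaluates the defining sum of repeated squarings directly. The field $GF(2^5)$ with reduction polynomial $x^5 + x^2 + 1$ and the function names are illustrative assumptions.

```python
N, POLY = 5, 0b100101   # GF(2^5) modulo x^5 + x^2 + 1 (illustrative choice)


def mul(a: int, b: int) -> int:
    """Multiply two field elements given as bit vectors of polynomial coefficients."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:     # reduce as soon as the degree reaches N
            a ^= POLY
    return r


def trace(x: int) -> int:
    """Tr(x) = x + x^2 + x^4 + ... + x^(2^(N-1)); the result is 0 or 1."""
    t = 0
    for _ in range(N):
        t ^= x
        x = mul(x, x)
    return t
```

On this field the trace takes the value 1 on exactly half of the 32 elements, which is the balance property that the later weight computations rely on.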



2.2 Periods, Characteristic Polynomials, and Minimal Polynomials 
of Sequences 

Let $A = \{a_i\}$ be a binary sequence. If v is a positive integer such that

    $a_i = a_{v+i}, \quad i = 0, 1, \ldots,$    (2)

then v is called a length of A. We also write $A = (a_0, a_1, \ldots, a_{v-1})$ and denote
v = length(A). Note that the index is reduced modulo v. If p is the smallest positive
integer satisfying (2), then we say p is the period of A, denoted as per(A). It is
easy to see that $p \mid v$.

Let $f(x) = x^l + c_{l-1} x^{l-1} + \cdots + c_1 x + c_0 \in F_2[x]$. If A satisfies the following
recursive relation:

    $a_{l+k} = \sum_{i=0}^{l-1} c_i a_{i+k} = c_{l-1} a_{l-1+k} + \cdots + c_1 a_{1+k} + c_0 a_k, \quad k = 0, 1, \ldots$

then we say f(x) is a characteristic polynomial of A over $F_2$.

The left shift operator L is defined as

    $L(A) = a_1, a_2, \ldots$

For any $i > 0$,

    $L^i(A) = a_i, a_{i+1}, \ldots$

We denote $L^0(A) = A$ by convention. If f(x) is a characteristic polynomial of
A over $F_2$, then

    $f(L)A = \sum_{i=0}^{l} c_i L^i(A) = 0 \quad (c_l = 1)$



where 0 represents a sequence consisting of all zeros. (Note 0 represents a number
0 or a sequence consisting of all zeros depending on the context.) Let

    $G(A) = \{ f(x) \in F_2[x] \mid f(L)A = 0 \}.$

The polynomial in G(A) with the smallest degree, say m(x), is called the minimal
polynomial of A over $F_2$. Note that G(A) is a principal ideal of $F_2[x]$ and
$G(A) = \langle m(x) \rangle$. So, if f(x) is a characteristic polynomial of A over $F_2$, then
$f(x) = m(x)h(x)$ where $h(x) \in F_2[x]$. The linear span of A over $F_2$, denoted as
LS(A), is defined as $LS(A) = \deg(m(x))$.
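The linear span of a concrete sequence can be computed with the Berlekamp-Massey algorithm, which is a standard technique and is not described in this paper. The sketch below returns the length of the shortest LFSR that generates the given bits; for a periodic sequence, feeding it two full periods gives the degree of the minimal polynomial. The function name linear_span is a hypothetical choice.

```python
def linear_span(bits):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR generating bits."""
    n = len(bits)
    c = [0] * (n + 1)      # current connection polynomial
    b = [0] * (n + 1)      # connection polynomial before the last length change
    c[0] = b[0] = 1
    span, m = 0, -1        # m: position of the last length change
    for i in range(n):
        # Discrepancy between the LFSR prediction and the actual bit.
        d = bits[i]
        for j in range(1, span + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * span <= i:
                span = i + 1 - span
                m = i
                b = t
    return span
```

For example, the period-7 m-sequence 1, 0, 0, 1, 0, 1, 1 satisfies $a_{k+3} = a_{k+1} + a_k$, so its characteristic polynomial is $x^3 + x + 1$ and its linear span is 3.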
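
2.3 Interleaved Sequences

We can arrange the elements of the sequence A into a t by s array as follows:

    $A = \begin{pmatrix}
        a_0     & a_t       & \cdots & a_{(s-1)t} \\
        a_1     & a_{t+1}   & \cdots & a_{(s-1)t+1} \\
        a_2     & a_{t+2}   & \cdots & a_{(s-1)t+2} \\
        \vdots  & \vdots    &        & \vdots \\
        a_{t-1} & a_{t+t-1} & \cdots & a_{(s-1)t+t-1}
    \end{pmatrix}$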

Let $A_i$ denote the ith row of the above array. Then we also write the sequence
$A = (A_0, A_1, \ldots, A_{t-1})^T$ where T denotes the transpose of a vector. In earlier work,
A is called an interleaved sequence if the $A_i$, $0 \le i \le t - 1$, all have the same minimal
polynomial over $F_2$. Here we generalize this concept to any structure of the $A_i$'s.
We still refer to A as a (t, s) interleaved sequence. By using the same approach
as in that work, we obtain the following proposition.

Proposition 1 Let v be a length of A and A be a (t, s) interleaved sequence
where v = ts. Let $m_i(x) \in F_2[x]$ be the minimal polynomial of $A_i$, $0 \le i < t$, and
$m(x) \in F_2[x]$ be the minimal polynomial of A, then

    $m(x) \mid m_j(x^t), \quad 0 \le j \le t - 1.$
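The (t, s) array view of a sequence is a simple re-indexing: row i collects every t-th element starting at position i. A minimal sketch, with hypothetical function names:

```python
def to_rows(a, t):
    """Split a sequence of length t*s into t rows A_0, ..., A_{t-1},
    where row i is (a_i, a_{t+i}, a_{2t+i}, ...)."""
    if len(a) % t != 0:
        raise ValueError("length must be a multiple of t")
    return [a[i::t] for i in range(t)]


def from_rows(rows):
    """Inverse of to_rows: read the array column by column."""
    return [x for column in zip(*rows) for x in column]
```

Reading the array column by column recovers the original sequence, which is why the rows are said to be interleaved.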

2.4 Elliptic Curves over $F_{2^n}$

An elliptic curve E over $F_{2^n}$ can be written in the following standard form:

    $y^2 + y = x^3 + c_4 x + c_6, \quad c_i \in F_{2^n}$    (3)

if E is supersingular, or

    $y^2 + xy = x^3 + c_2 x^2 + c_6, \quad c_i \in F_{2^n}$    (4)

if E is non-supersingular. The points $P = (x, y)$, $x, y \in F_{2^n}$, that satisfy this
equation, together with a "point at infinity" denoted O, form an Abelian group
$(E, +, O)$ whose identity element is O.

Let $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ be two different points in E, neither of which
is the point at infinity.

Addition Law for E supersingular. For $2P = P + P = (x_3, y_3)$,

    $x_3 = x_1^4 + c_4^2$    (5)

    $y_3 = (x_1^2 + c_4)(x_1 + x_3) + y_1 + 1$    (6)

For $P + Q = (x_3, y_3)$, if $x_1 = x_2$, then $P + Q = O$. Otherwise,

    $x_3 = \lambda^2 + x_1 + x_2$

    $y_3 = \lambda(x_1 + x_3) + y_1 + 1$

where $\lambda = (y_1 + y_2)/(x_1 + x_2)$.
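The addition law for the supersingular case can be checked on a small example. The sketch below assumes the field $GF(2^5)$ with reduction polynomial $x^5 + x^2 + 1$ and the curve $y^2 + y = x^3 + x$ (that is, $c_4 = 1$, $c_6 = 0$); these parameters and the function names are illustrative choices, not taken from the paper.

```python
N, POLY = 5, 0b100101      # GF(2^5) modulo x^5 + x^2 + 1 (illustrative)
C4, C6 = 1, 0              # curve y^2 + y = x^3 + c4*x + c6 (illustrative)
O = None                   # point at infinity


def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def inv(a):
    """a^(2^N - 2), the inverse of a nonzero field element."""
    r = 1
    for _ in range(N - 1):
        a = mul(a, a)
        r = mul(r, a)
    return r


def on_curve(P):
    if P is O:
        return True
    x, y = P
    return mul(y, y) ^ y == mul(mul(x, x), x) ^ mul(C4, x) ^ C6


def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 != y2:                  # Q = -P = (x1, y1 + 1)
            return O
        s = mul(x1, x1) ^ C4          # doubling: s = x1^2 + c4
        x3 = mul(s, s)                # x3 = x1^4 + c4^2, formula (5)
        return (x3, mul(s, x1 ^ x3) ^ y1 ^ 1)     # formula (6)
    lam = mul(y1 ^ y2, inv(x1 ^ x2))
    x3 = mul(lam, lam) ^ x1 ^ x2
    return (x3, mul(lam, x1 ^ x3) ^ y1 ^ 1)
```

In characteristic 2 field addition is XOR, so the formulas translate almost symbol for symbol; the negative of (x, y) on this curve is (x, y + 1).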

Remark 1 For a detailed treatment of sequence analysis and an introduction to
elliptic curves, the reader is referred to the references.

3 Constructions of Pseudorandom Sequences from
Elliptic Curves over $F_{2^n}$

In this section, we give a construction of binary sequences from an elliptic curve
over $F_q$.

Let E be an elliptic curve over $F_q$, denoted as $E(F_q)$ or simply E if there
is no confusion about the field that we work with, and let |E| be the number of
points of E over $F_q$. Let $P = (x_1, y_1)$ be a point of E with order v + 1. Note that
$v + 1 \mid |E|$. Let $\Gamma = (P, 2P, \ldots, vP)$ where $iP = (x_i, y_i)$, $1 \le i \le v$. Note that v
is even if E is supersingular, while v may be odd or even if E is non-supersingular.
So, we can write $v = 2l$ if E is supersingular and $v = 2l + e$, $e \in F_2$, if E is
non-supersingular.



3.1 Construction 

Let

    $a_i = Tr(x_i)$ and $b_i = Tr(y_i)$, $i = 1, 2, \ldots, v,$    (7)

    $S_0 = (a_1, \ldots, a_v)$ and $S_1 = (b_1, \ldots, b_v).$    (8)

Let $S = (S_0, S_1)^T$ be a (2, v) interleaved sequence, i.e., the elements of
$S = \{s_i\}_{i \ge 1}$ are given by

    $s_{2i-1} = a_i$ and $s_{2i} = b_i$, $i = 1, \ldots, v$    (9)

where length(S) = 2v. For convenience in the following sections, we
index S starting from 1, and we denote 0 as 2v when the index is computed modulo
2v. We call S a binary elliptic curve pseudorandom sequence generated by $E(F_q)$
of type I, an EC-sequence for short.
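The construction can be tried end to end on a toy curve. The sketch below assumes $GF(2^5)$ with reduction polynomial $x^5 + x^2 + 1$ and the supersingular curve $y^2 + y = x^3 + x$; it picks a point of largest order by brute force, computes P, 2P, ..., vP, and interleaves the traces of the coordinates. All parameters and function names are illustrative assumptions, and the sizes are far too small for any cryptographic use.

```python
N, POLY = 5, 0b100101      # GF(2^5) modulo x^5 + x^2 + 1 (illustrative)
C4, C6 = 1, 0              # curve y^2 + y = x^3 + x (illustrative)
O = None                   # point at infinity


def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> N) & 1:
            a ^= POLY
    return r


def inv(a):
    r = 1
    for _ in range(N - 1):
        a = mul(a, a)
        r = mul(r, a)
    return r


def trace(x):
    t = 0
    for _ in range(N):
        t ^= x
        x = mul(x, x)
    return t


def add(P, Q):
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        if y1 != y2:
            return O
        lam = mul(x1, x1) ^ C4                 # doubling slope
        x3 = mul(lam, lam)
    else:
        lam = mul(y1 ^ y2, inv(x1 ^ x2))
        x3 = mul(lam, lam) ^ x1 ^ x2
    return (x3, mul(lam, x1 ^ x3) ^ y1 ^ 1)


def multiples(P):
    """Return [P, 2P, ..., vP], where v + 1 is the order of P."""
    pts, R = [], P
    while R is not O:
        pts.append(R)
        R = add(R, P)
    return pts


def ec_sequence(P):
    pts = multiples(P)
    a = [trace(x) for x, _ in pts]                    # a_i = Tr(x_i)
    b = [trace(y) for _, y in pts]                    # b_i = Tr(y_i)
    s = [bit for pair in zip(a, b) for bit in pair]   # s_{2i-1} = a_i, s_{2i} = b_i
    return a, b, s


points = [(x, y) for x in range(1 << N) for y in range(1 << N)
          if mul(y, y) ^ y == mul(mul(x, x), x) ^ mul(C4, x) ^ C6]
P = max(points, key=lambda pt: len(multiples(pt)))
a, b, s = ec_sequence(P)
```

Since iP and (v + 1 - i)P are negatives of each other, the list a reads the same forwards and backwards, and b reversed is the complement of b (because Tr(1) = 1 when n is odd).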

Remark 2 In the full paper we discuss two other methods of constructing 
sequences from elliptic curves. 



Let $A = (a_1, a_2, \ldots, a_l)$ and $B = (b_1, b_2, \ldots, b_l)$. If $U = (u_1, u_2, \ldots, u_t)$, then
we denote $\overleftarrow{U} = (u_t, u_{t-1}, \ldots, u_1)$, i.e., U written backwards.

Theorem 1 With the above notation, let $v + 1 \mid |E|$, and let $S = (S_0, S_1)^T$ be an
EC-sequence generated by $E(F_q)$ of length 2v whose elements are given by (9).
Let E be supersingular. Then

    $S = \begin{pmatrix} A & \overleftarrow{A} \\ B & \overleftarrow{B} + 1 \end{pmatrix}$    (10)

Proof. Let E be supersingular. Note that y and y + 1 are the two roots of (3) in $F_q$
under the condition $Tr(x^3 + c_4 x + c_6) = 0$. Since the order of P is v + 1 = 2l + 1,

    $iP + (2l + 1 - i)P = O \implies x_{l+i} = x_{l+1-i}, \quad y_{l+i} = y_{l+1-i} + 1, \quad 1 \le i \le l.$

Thus we have $S_0 = (A, \overleftarrow{A})$ and $S_1 = (B, \overleftarrow{B} + 1)$.

4 Statistical Properties of Supersingular EC-Sequences

In this section, we discuss the statistical properties of EC-sequences generated
by supersingular curves over $F_{2^n}$ where n is odd. Let $A = (a_0, \ldots, a_{p-1})$ and let w(A)
represent the Hamming weight of the sequence A, i.e.,

    $w(A) = |\{i \mid a_i = 1, 0 \le i < p\}|.$

For convenience, we generalize the notation of Hamming weight of binary sequences
to functions from $F_q$ to $F_2$. Let g(x) be a function from $F_q$ to $F_2$; the
weight of g is defined as $w(g) = |\{x \in F_q \mid g(x) = 1\}|$. For two isomorphic curves
$E(F_q)$ and $T(F_q)$, we write $E \cong T$. There are three different
isomorphism classes for supersingular curves over $F_q$ ($q = 2^n$) for n odd.

1. $\mathcal{E}_1 = \{E(F_q) \mid E(F_q) \cong y^2 + y = x^3\}$ and $|\mathcal{E}_1| = 2^{2n-1}$, and for any
   $E(F_q) \in \mathcal{E}_1$, $|E| = q + 1$.

2. $\mathcal{E}_2 = \{E(F_q) \mid E(F_q) \cong y^2 + y = x^3 + x\}$.

3. $\mathcal{E}_3 = \{E(F_q) \mid E(F_q) \cong y^2 + y = x^3 + x + 1\}$.

Here $|\mathcal{E}_2| = |\mathcal{E}_3| = 2^{2n-2}$. For any $E(F_q) \in \mathcal{E}_2$ or $\mathcal{E}_3$, $|E| = 2^n \pm 2^{(n+1)/2} + 1$.
Let

    $E : y^2 + y = x^3 + c_4 x + c_6, \quad c_4, c_6 \in F_q.$

Theorem 2 Let n be odd. Let
$S = \begin{pmatrix} A & \overleftarrow{A} \\ B & \overleftarrow{B} + 1 \end{pmatrix}$
be an EC-sequence generated
by a supersingular elliptic curve E where length(S) = 2v and v = |E| - 1.
Then $w(S_0) = 2w(A)$, $w(S_1) = v/2$ and $w(S) = 2w(A) + v/2$, where
$w(A) = 2^{n-2} \pm 2^{(n-3)/2}$.

In order to prove this result, we need the following lemma. If we denote
$h(x) = x^3 + c_4 x + c_6$, then E can be written as $y^2 + y = h(x)$.

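Lemma 1 Let E and h(x) be defined as above. Then we have

    $\sum_{x \in F_{2^n}} (-1)^{Tr(h(x))} = |E| - 1 - 2^n.$

Proof.

    $\sum_{x \in F_{2^n}} (-1)^{Tr(h(x))} = |\{x \in F_{2^n} : Tr(h(x)) = 0\}| - |\{x \in F_{2^n} : Tr(h(x)) = 1\}|$

    $= 2\,|\{x \in F_{2^n} : Tr(h(x)) = 0\}| - 2^n$

    $= (|E| - 1) - 2^n.$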
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For z, j = 0, 1, define 



rii^j = |{a; G F 2 " : Tr{x) = i,Tr{h{x)) = j}|. 

Next we determine ni,o- Let F denote the elliptic curve + y = h{x) + x. 
Then the following equations hold: 



nifi + ni,i = 2” ^ 

^0,0 + ^0,1 = 2 " ^ 
no,o + ’t'l.o = (IL'I ~ l)/2 
n- 0,0 + ’t' 1,1 ~ (tio,i + ^^i,o) = 1^1 ~ 1 ~ 2”. 

Note that the last equation follows easily from LemmaJ since 



?^o,o + ’^i.i — (’^o.i + ni,o) = |{a; G F 2 " : Tr{x + h{x)) 

= 0}| — |{a; G F 2 » : Tr{x + h{x)) = 1}|. 



Now, this system of four equations in four unknowns is easily seen to have a 
unique solution. The value of ni^o is £^s stated in the following lemma: 



Lemma 2 Let $E$, $F$ and $n_{1,0}$ be defined as above. Then we have

$$n_{1,0} = 2^{n-2} + \frac{|E| - |F|}{4}.$$

It is known that $|E| - |F| = \pm 2^{(n+1)/2}$ for any values of $c_4$ and $c_6$. (This is shown in the literature; alternatively, it follows easily from the cited results on pp. 40 and 47.) Thus we have the following corollary:

Corollary 1 Let $n_{1,0}$ be defined as above; then $n_{1,0} = 2^{n-2} \pm 2^{(n-3)/2}$.

Proof (Proof of Theorem 2). Since $length(S) = 2v$, from Theorem 1 we have $w(S_0) = 2w(A)$ and $w(S_1) = v/2$. So,

$$w(S) = 2w(A) + v/2. \qquad (11)$$

According to the definition of $n_{i,j}$, we have $w(A) = n_{1,0}$. From Corollary 1, $w(A) = 2^{n-2} \pm 2^{(n-3)/2}$.
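The counting argument behind Lemma 1, Lemma 2 and Corollary 1 can be checked by brute force on a small field. The following is only a sketch under stated assumptions: it works in $\mathbb{F}_{2^5}$ represented modulo $x^5 + x^2 + 1$ (an assumed representation, chosen here for illustration), and the helper names `gf_mul`, `trace`, `h` and `summary` are hypothetical.

```python
# Toy check over F_{2^5}; the modulus x^5 + x^2 + 1 is an assumed representation.
N = 5
Q = 1 << N
MOD = 0b100101  # x^5 + x^2 + 1, irreducible over F_2


def gf_mul(a, b):
    """Multiply two elements of F_{2^N} given as bit vectors."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def trace(a):
    """Absolute trace Tr(a) = a + a^2 + ... + a^(2^(N-1)); the result is 0 or 1."""
    t, s = 0, a
    for _ in range(N):
        t ^= s
        s = gf_mul(s, s)
    return t


def h(x, c4, c6):
    """h(x) = x^3 + c4*x + c6."""
    return gf_mul(gf_mul(x, x), x) ^ gf_mul(c4, x) ^ c6


def summary(c4=0, c6=0):
    """Return (n, |E|, |F|, character sum) for E: y^2 + y = h(x)."""
    n = [[0, 0], [0, 0]]  # n[i][j] = #{x : Tr(x) = i, Tr(h(x)) = j}
    for x in range(Q):
        n[trace(x)][trace(h(x, c4, c6))] += 1
    order_e = 1 + 2 * (n[0][0] + n[1][0])  # two points per x with Tr(h(x)) = 0
    order_f = 1 + 2 * (n[0][0] + n[1][1])  # F: y^2 + y = h(x) + x
    char_sum = sum((-1) ** trace(h(x, c4, c6)) for x in range(Q))
    return n, order_e, order_f, char_sum


n, e, f, cs = summary(0, 0)
print(n[1][0], e, f, cs)
```

For the curve $y^2 + y = x^3$ the count $n_{1,0}$ lands on the "+" branch of Corollary 1, which matches the weight $w(S_0) = 2w(A) = 20$ of the x-coordinate sequence in the worked example of Section 7.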



Remark 3 The value of $w(A)$ depends on the values of $c_4$ and $c_6$. For further results on this, we refer the reader to the full version of this work.

5 Periods of Supersingular EC-Sequences 

In this section, we discuss the periods of EC-sequences generated by supersingular curves.

Lemma 3 Let $S = (S_0, S_1)^T$ be an EC-sequence generated by a supersingular elliptic curve $E(\mathbb{F}_q)$ where $S_0 = (a_1, a_2, \ldots, a_v)$ and $v = |E| - 1 = 2l$. Then

$$a_{2i} = a_i + Tr(c_4), \quad i = 1, 2, \ldots, l.$$

Proof. Recall that $a_i = Tr(x_i)$. From the doubling formula in Section 1,

$$x_{2i} = x_i^4 + c_4^2, \quad i = 1, \ldots, l. \qquad (12)$$

$$a_{2i} = Tr(x_{2i}) = Tr(x_i^4 + c_4^2) = Tr(x_i) + Tr(c_4) = a_i + Tr(c_4).$$



Definition 1 Let $U = (u_1, u_2, \ldots, u_{2k})$ be a binary sequence of length $2k$. Then $U$ is called a coset fixed palindrome sequence of length $2k$, CFP-sequence of length $2k$ for short, if it satisfies the following two conditions.

(i) Palindrome Condition (P)

$U = (U_0, \overleftarrow{U_0})$ where $U_0 = (u_1, u_2, \ldots, u_k)$ and $\overleftarrow{U_0} = (u_k, \ldots, u_1)$ is $U_0$ reversed.

(ii) Coset Fixed Condition (CF)

$u_{2i} = u_i + c$, for each $1 \le i \le k$, where $c$ is a constant in $\mathbb{F}_2$.



Lemma 4 Let $U$ be a CFP sequence of length $2d$ and $0 < w(U) < 2d$. Then $per(U) = 2d$.

Proof. We claim that $per(U) > 2$. Otherwise, from the coset fixed condition $u_{2i} = u_i$, $1 \le i \le d$, we get $w(U) = 0$ or $w(U) = 2d$, which is a contradiction with the given condition. Therefore we can write $per(U) = t$ where $2 < t$ and $t \mid 2d$. If $t < 2d$, let $2d = ts$. Then

$$u_{t+i} = u_i, \quad i = 1, 2, \ldots. \qquad (13)$$

Since $U$ is a CFP sequence, from condition (i) in Definition 1, we have

$$u_{d-i} = u_{d+i+1}, \quad 0 \le i \le d - 1. \qquad (14)$$

From (13) and (14) we get

$$u_{l-i} = u_{l+i+1}, \quad 0 \le i \le l - 1 \qquad (15)$$

where $l = t/2$ if $t$ is even, and

$$u_{l-i} = u_{l+i}, \quad 1 \le i \le l - 1 \qquad (16)$$

where $l = (t + 1)/2$ if $t$ is odd. From condition (ii) in Definition 1,

$$u_{2i} = u_i + c, \quad 1 \le i \le t. \qquad (17)$$

Since $0 < w(U) < 2d$ and $U$ satisfies the CF condition, there exists $k$, $0 \le k < l$, such that

$$(u_{t+2k+1}, u_{t+2k+2}) = (1, 0) \text{ or } (0, 1). \qquad (18)$$

(For a detailed proof of existence of such $k$, please see the full version of this paper.)

Case 1 $t = 2l$. Applying the above identities,

$$u_{l+k+1} = u_{2l+2k+2} + c = u_{t+2k+2} + c. \qquad (19)$$

On the other hand,

$$u_{l+k+1} = u_{l-k} = u_{2l-2k} + c = u_{t-2k} + c = u_{t+2k+1} + c. \qquad (20)$$

(19) and (20) $\implies u_{t+2k+1} = u_{t+2k+2}$, which contradicts (18). Thus $per(U) = 2d$.

Case 2 $t = 2l - 1$.

$$u_{l+k+1} = u_{2l+2k+2} + c = u_{t+2k+1} + c. \qquad (21)$$

$$u_{l+k+1} = u_{l-k-1} = u_{2l-2k-2} + c = u_{t-2k-1} + c = u_{t+2k+2} + c \qquad (22)$$

and (21) and (22) $\implies u_{t+2k+1} = u_{t+2k+2}$, which contradicts (18). Thus $per(U) = 2d$.



Lemma 5 Let $S = (S_0, S_1)^T$ be an EC-sequence of length $2v$, generated by a supersingular elliptic curve $E(\mathbb{F}_q)$, where $v \mid (|E| - 1)$ and $0 < w(S_0) < v$. Then $per(S_0) = v$.

Proof. From Theorem 1 we have $S_0 = (A, \overleftarrow{A})$, that is, $A$ followed by its reversal, where $length(A) = v/2$. Together with Lemma 3, $S_0$ is a CFP sequence of length $v$. Since $0 < w(S_0) < v$, applying Lemma 4 we get $per(S_0) = v$.
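Lemma 5 can be illustrated on the x-coordinate sequence that the worked example in Section 7 reports for $n = 5$. The sketch below reads the palindrome condition as the mirror symmetry written out in (14) and treats the period as the least cyclic shift that fixes the sequence; the function names `cfp_constant` and `period` are hypothetical.

```python
def cfp_constant(u):
    """Return the constant c if u is a CFP-sequence, otherwise None."""
    n = len(u)
    if n == 0 or n % 2:
        return None
    k = n // 2
    # Palindrome condition: the second half is the first half reversed.
    if u[k:] != u[:k][::-1]:
        return None
    # Coset fixed condition (1-indexed): u_{2i} = u_i + c for 1 <= i <= k.
    c = u[1] ^ u[0]
    if all(u[2 * i - 1] == (u[i - 1] ^ c) for i in range(1, k + 1)):
        return c
    return None


def period(u):
    """Least t dividing len(u) such that u is invariant under a cyclic shift by t."""
    n = len(u)
    for t in range(1, n + 1):
        if n % t == 0 and all(u[(i + t) % n] == u[i] for i in range(n)):
            return t


# x-coordinate sequence S_0 from the n = 5 example in Section 7 (v = 32).
s0 = [int(ch) for ch in "00101110110111100111101101110100"]
print(cfp_constant(s0), sum(s0), period(s0))  # prints "0 20 32"
```

The constant $c = 0$ agrees with Lemma 3 when $Tr(c_4) = 0$, and the weight 20 lies strictly between 0 and $v = 32$, so the hypothesis of Lemma 5 holds for this sequence.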

Lemma 6 Let $S = (S_0, S_1)^T$ be an EC-sequence of length $2v$, generated by an elliptic curve $E(\mathbb{F}_q)$, where $v \mid (|E| - 1)$. Then $per(S)$ is an even number.

Proof. Assume that $per(S) = 2t + 1$. Then we have $s_1 = s_{2t+2} = b_{t+1}$ and $b_{v-t} = s_{2v-2t} = s_1$, so $b_{v-t} = b_{t+1}$. From Theorem 1, $b_{v-t} = b_{t+1} + 1$, which is a contradiction. So, $per(S)$ is even.

Theorem 3 Let $S = (S_0, S_1)^T$ be an EC-sequence of length $2v$, generated by a supersingular elliptic curve $E(\mathbb{F}_q)$, where $v \mid (|E| - 1)$ and $0 < w(S_0) < v$. Then $per(S) = 2v$.

Proof. Since $length(S) = 2v$, we have $per(S) \mid 2v$. According to Lemma 6, $per(S) = 2t$ where $t \mid v$. Assume that $t < v$. Then

$$a_{t+j} = s_{2(t+j)-1} = s_{2t+2j-1} = s_{2j-1} = a_j, \quad j = 1, 2, \ldots.$$

Thus $t$ is a period of $S_0$, so $per(S_0) \mid t$. According to Lemma 5, $per(S_0) = v$, which contradicts $t < v$. Thus $t = per(S_0) = v$ and $per(S) = 2v$.

Corollary 2 Let $n$ be odd. Let $S = (S_0, S_1)^T$ be an EC-sequence of length $2v$, generated by a supersingular elliptic curve $E(\mathbb{F}_q)$, where $v \mid (|E| - 1)$. Then $per(S) = 2v$.

Proof. From Theorem 2, we have $0 < w(S_0) < v$. Applying Theorem 3, the result follows.



6 Linear Span of Supersingular EC-Sequences

In this section, we derive a lower bound and an upper bound on the linear span of the EC-sequences generated by supersingular elliptic curves in the isomorphism class $E_1$. For convenience in using Proposition 1, from now on we will write $S$, $S_0$ and $S_1$ with the starting index at 0, i.e., $S = (s_0, s_1, \ldots, s_{2^{n+1}-1})$, $S_0 = (a_0, a_1, \ldots, a_{2^n-1})$ and $S_1 = (b_0, b_1, \ldots, b_{2^n-1})$ ($v = 2^n$ in this case). So,

$$a_i = s_{2i}, \quad i = 0, 1, \ldots,$$
$$b_i = s_{2i+1}, \quad i = 0, 1, \ldots.$$



Lemma 7 Let $U = (u_0, \ldots, u_{2^k-1})$ where $per(U) = 2^k$ and $w(U) \equiv 0 \bmod 2$. Then the linear span of $U$, $LS(U)$, is bounded as follows:

$$2^{k-1} \le LS(U) \le 2^k - 1.$$

Proof. Let $h(x)$ be the minimal polynomial of $U$ over $\mathbb{F}_2$. Let $f(x) = x^{2^k} + 1$; then $f(L)(U) = 0$, where $L$ is the shift operator. Thus $h(x) \mid f(x)$. Since

$$f(x) = x^{2^k} + 1 = (x + 1)^{2^k},$$

we have $h(x) = (x + 1)^r$ where $r$ is in the range $1 \le r \le 2^k$. Since $w(U) \equiv 0 \bmod 2$, letting $p = 2^k$, we have

$$\sum_{i=0}^{p-1} u_{i+j} = 0, \quad j = 0, 1, \ldots$$

$\implies g(x) = \sum_{i=0}^{p-1} x^i$ is a characteristic polynomial of $U$ over $\mathbb{F}_2$. So $h(x) \mid g(x) \implies LS(U) \le 2^k - 1$.

On the other hand, if $r < 2^{k-1}$, then $h(x) \mid (x + 1)^{2^{k-1}} = x^{2^{k-1}} + 1$

$\implies x^{2^{k-1}} + 1$ is a characteristic polynomial of $U$ over $\mathbb{F}_2$

$\implies u_{2^{k-1}+i} + u_i = 0$, $i = 0, 1, \ldots$

$\implies per(U) \mid 2^{k-1}$. This contradicts $per(U) = 2^k$. So, $r = LS(U) \ge 2^{k-1}$.



Theorem 4 Let $n$ be odd. Let $S$ be an EC-sequence of length $2v$, generated from a supersingular elliptic curve $E(\mathbb{F}_q)$ which is isomorphic to $y^2 + y = x^3$, where $v = |E| - 1$. Then

$$2^n \le LS(S) \le 2(2^n - 1).$$

Proof. From Corollary 2 we have $per(S) = 2^{n+1}$. According to Theorem 2, $w(S) \equiv 0 \bmod 2$. So, $S$ satisfies the conditions of Lemma 7. Applying Lemma 7,

$$2^n \le LS(S) \le 2^{n+1} - 1.$$

Now, we only need to prove that $LS(S) \le 2(2^n - 1)$. Let $m(x)$ and $m_0(x)$ be the minimal polynomials of $S$ and $S_0$ over $\mathbb{F}_2$, respectively, where $S = (S_0, S_1)^T$. According to Proposition 1 we have

$$m(x) \mid m_0(x^2) \implies \deg(m(x)) \le 2\deg(m_0(x)).$$

Since $S_0$ also satisfies the condition of Lemma 7, we get $\deg(m_0(x)) = LS(S_0) \le 2^n - 1$. So,

$$LS(S) = \deg(m(x)) \le 2\deg(m_0(x)) \le 2(2^n - 1).$$

7 Applications 

In this section, using the theoretical results that we obtained in the previous sections, we construct a class of EC-sequences with large linear spans and small bias of unbalance, point out its implementation and give a comparison of ECPSG I with other known pseudorandom sequence generators.

7.1 ECPSG I 

(a) Choose a finite field $K = \mathbb{F}_{2^n}$ where $n$ is odd.

(b) Randomly choose a supersingular curve $E : y^2 + y = x^3 + c_4 x + c_6$ over $\mathbb{F}_{2^n}$ in the isomorphism class $E_1$ of the curve $y^2 + y = x^3$. ($|E_1| = 2^{2n-1}$.)

(c) Randomly choose a point $P = (x, y)$ on the curve $E$ such that the order of $P$ is $2^n + 1$.

(d) Compute $iP = (x_i, y_i)$, $i = 1, \ldots, 2^n$.

(e) Map $iP$ into a binary pair by using the trace function

$a_i = Tr(x_i)$ and $b_i = Tr(y_i)$.

(f) Concatenate the pairs $(a_i, b_i)$ to construct the sequence $S = (a_1, b_1, a_2, b_2, \ldots, a_{2^n}, b_{2^n})$.

$$G(E_1) = \{S = \{s_i\} \mid S \text{ generated by } E(\mathbb{F}_{2^n}) \in E_1\}.$$

$G(E_1)$ is called an elliptic curve pseudorandom sequence generator of type I (ECPSG I). Any sequence in $G(E_1)$ satisfies $per(S) = 2^{n+1}$, $w(S) = 2^n \pm 2^{(n-1)/2}$ and $2^n \le LS(S) \le 2(2^n - 1)$.
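Steps (a) to (f) can be sketched in a few lines for a toy parameter. The code below is an illustration under assumptions, not the authors' implementation: it fixes $n = 5$, represents $\mathbb{F}_{2^5}$ modulo $x^5 + x^2 + 1$ (an assumed choice), uses the standard chord-and-tangent formulas for curves of the form $y^2 + y = x^3 + c_4 x + c_6$ in characteristic 2, and picks the first point of order $2^n + 1$ instead of a random one. All function names are hypothetical.

```python
N = 5
Q = 1 << N
MOD = 0b100101  # x^5 + x^2 + 1 (assumed field representation)


def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & Q:
            a ^= MOD
    return r


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(2^N - 2)."""
    r = 1
    for _ in range(Q - 2):
        r = gf_mul(r, a)
    return r


def trace(a):
    t, s = 0, a
    for _ in range(N):
        t ^= s
        s = gf_mul(s, s)
    return t


def on_curve(x, y, c4, c6):
    return (gf_mul(y, y) ^ y) == (gf_mul(gf_mul(x, x), x) ^ gf_mul(c4, x) ^ c6)


def ec_add(p, r, c4):
    """Add points on y^2 + y = x^3 + c4*x + c6; None is the point at infinity."""
    if p is None:
        return r
    if r is None:
        return p
    (x1, y1), (x2, y2) = p, r
    if x1 == x2:
        if y1 != y2:
            return None  # r = -p = (x1, y1 + 1)
        lam = gf_mul(x1, x1) ^ c4  # tangent slope x1^2 + c4
    else:
        lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))  # chord slope
    x3 = gf_mul(lam, lam) ^ x1 ^ x2
    y3 = gf_mul(lam, x1 ^ x3) ^ y1 ^ 1
    return (x3, y3)


def point_order(p, c4):
    k, r = 1, p
    while r is not None:
        r = ec_add(r, p, c4)
        k += 1
    return k


def ecpsg(c4=0, c6=0):
    """Return (S_0, S_1, S) for the first point of order 2^N + 1 on the curve."""
    points = [(x, y) for x in range(Q) for y in range(Q) if on_curve(x, y, c4, c6)]
    p = next(pt for pt in points if point_order(pt, c4) == Q + 1)
    a, b, r = [], [], p
    for _ in range(Q):  # r runs through P, 2P, ..., 2^N * P
        a.append(trace(r[0]))
        b.append(trace(r[1]))
        r = ec_add(r, p, c4)
    s = [bit for pair in zip(a, b) for bit in pair]
    return a, b, s


a, b, s = ecpsg()
print(sum(a), sum(b), sum(s))  # prints "20 16 36"
```

With $c_4 = c_6 = 0$ the weights agree with Theorem 2 ($w(S_1) = v/2$ and $w(S) = 2^n + 2^{(n-1)/2}$); the bit pattern itself depends on the chosen point and field representation, so it need not coincide with the sequence printed in the example that follows.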

Example Let $n = 5$.

(a) Construct a finite field $\mathbb{F}_{2^5}$ which is generated by a primitive polynomial $f(x) = x^5 + \ldots + 1$. Let $\alpha$ be a root of $f(x)$. We represent the elements in $\mathbb{F}_{2^5}$ as powers of $\alpha$. For the zero element, we write $0 = \alpha^{\infty}$.

(b) Choose a curve $E : y^2 + y = \ldots$.

(c) Choose $P = (\alpha, \ldots)$ with order 33.

(d) Compute $iP = (x_i, y_i)$, $i = 1, \ldots, 32$; the exponents of $\alpha$ for each point $iP$ are listed in Table 1.



Table 1. {iP} 




(e) Map the point $iP$ into two bits by the trace function:
x-coordinate sequence

$\{a_i = Tr(x_i)\}$ = 00101110110111100111101101110100

and y-coordinate sequence

$\{b_i = Tr(y_i)\}$ = 01101001101101101001001001101001

(f) Interleave $(a_i, b_i)$:

$S = (a_1, b_1, a_2, b_2, \ldots, a_{32}, b_{32})$

= 0001110011101001111001111011110001101011100011100011111001100001

According to the theorems above, we have

- $per(S) = 64$.

- $w(S) = 2^5 + 2^2 = 36$. The bias of unbalance is equal to 4 for $S$.

- Linear span: $32 \le LS(S) \le 62$.

Remark 4 1. The actual linear span of $S$ is 62 and it has the minimal polynomial $m(x) = (x + 1)^{62}$.

2. The linear span of a periodic sequence is invariant under the cyclic shift operation on the sequence. We computed the supersingular EC-sequences over $\mathbb{F}_{2^5}$ and $\mathbb{F}_{2^7}$ for all phase shifts of the sequences. Experimental data shows that the profile of linear spans of any supersingular EC-sequence increases smoothly for each phase shift of the sequence.
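The linear span quoted in Remark 4 can be recomputed directly from the printed sequence. The sketch below does not follow any procedure from the paper; it uses the Games-Chan algorithm, a standard method for binary sequences whose period is a power of two, which returns the exponent $r$ of the minimal polynomial $(x + 1)^r$ from Lemma 7. The function name `games_chan` is hypothetical.

```python
def games_chan(seq):
    """Linear complexity of a binary sequence with period len(seq) = 2^k."""
    s = list(seq)
    assert s and (len(s) & (len(s) - 1)) == 0, "length must be a power of two"
    c = 0
    while len(s) > 1:
        half = len(s) // 2
        left, right = s[:half], s[half:]
        if left == right:
            s = left  # the period is really half as long
        else:
            c += half
            s = [l ^ r for l, r in zip(left, right)]
    return c + s[0]


# Interleaved sequence S from the n = 5 example above.
S = [int(ch) for ch in
     "0001110011101001111001111011110001101011100011100011111001100001"]
print(games_chan(S))  # prints 62
```

The result 62 meets the upper bound $2(2^n - 1)$ of Theorem 4 for $n = 5$.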

7.2 Implementation of ECPSG I 

Implementation of ECPSG relies only on implementation of elliptic curves over $\mathbb{F}_{2^n}$, so we can borrow software/hardware from elliptic curve public-key cryptosystems to implement ECPSG.

7.3 A Table

In Table 2, we compare the period, frequency range of 1 occurrence, unbalance range, and linear span (LS) of ECPSG I with other sequence generators, such as filter function generators (FFG), combinatorial function generators (CFG), and clock controlled generators (CCG). We also include data for de Bruijn sequences. We conclude that ECPSG I may be suitable for use as a key generator in a stream cipher cryptosystem.



Table 2. Comparison of ECPSG I with Other Sequence Generators

Type of Generator | Period | Frequency Range of 1 occurrence | Unbalance Range | Linear Span
FFG | $2^n - 1$ | | | unclear
CFG | $\le 2^n - 1$ | $[1, 2^{n-1}]$ | $[1, 2^{n-1}]$ | unclear
CCG | $(2^n - 1)^2$ | $2^{n-1}(2^n - 1)$ | $2^n - 1$ | $n(2^n - 1)$
de Bruijn | $2^{n+1}$ | $2^n$ | 0 | $\ge 2^n + n + 1$, $\le 2^{n+1} - 1$
ECPSG I | $2^{n+1}$ | $2^n \pm 2^{(n-1)/2}$ | | $\ge 2^n$, $\le 2^{n+1} - 2$

Abstract. In previous work, security results of decorrelation theory were based on the infinity-associated matrix norm. This makes it possible to prove that decorrelation provides security against non-adaptive iterated attacks. In this paper we define a new matrix norm dedicated to adaptive chosen plaintext attacks. Similarly, we construct another matrix norm dedicated to chosen plaintext and ciphertext attacks.

The formalism from decorrelation makes it possible to manipulate the notion of best advantage for distinguishers so easily that we prove as a trivial consequence a somewhat intuitive theorem which says that the best advantage for distinguishing a random product cipher from a truly random permutation decreases exponentially with the number of terms.

We show that several of the previous results on decorrelation extend to these new norms. In particular, we show that the Peanut construction (for instance the DFC algorithm) provides security against adaptive iterated chosen plaintext attacks with unchanged bounds, and security against adaptive iterated chosen plaintext and ciphertext attacks with other bounds, which shows that it is actually super-pseudorandom.

We also generalize the Peanut construction to any scheme instead of the Feistel one. We show that one only requires an equivalent to Luby-Rackoff’s Lemma in order to get decorrelation upper bounds.



Since the beginning of conventional cryptography, the theory of the formal security of encryption algorithms has hardly had foundations. Decorrelation theory makes it possible to deal with randomness and d-wise independence in connection with security. This provides a way of proving security against restricted attacks. Other approaches treat unconditional security for encryption in a group structure (see Pliam).

Decorrelation theory provides new directions to design block ciphers with provable security against some classes of standard attacks. Decorrelating to an order of $d$ a block cipher $C_K$ which depends on a random key $K$ roughly consists in making sure that for all $d$ plaintexts $(x_1, \ldots, x_d)$, the corresponding ciphertexts $(C_K(x_1), \ldots, C_K(x_d))$ are uncorrelated. This way, decorrelated functions are generalizations of Maurer’s locally random functions. Although the notion of decorrelation is quite intuitive, there is no formal definition of it, but instead several ways to measure it. Decorrelation theory usually has four tasks.

* Part of this work was done while the author was visiting the NTT Laboratories. 

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 49-^^2000.

1. Defining a measurement for the decorrelation. This usually relies on a matrix norm.

2. Constructing a simple primitive (also called “decorrelation module”) with a quite good decorrelation.

3. Constructing cryptographic algorithms with decorrelation modules in such a way that the decorrelation of the primitive can be inherited by the algorithm.

4. Proving that the decorrelation provides security against classes of attacks.

In previous work, these issues have been treated with the infinity-associated matrix norm (denoted $|||\cdot|||_\infty$). In particular, it was shown that this norm corresponds to the best advantage of a non-adaptive chosen plaintext attack. The present paper proves the same results, but with a quite non-intuitive norm which corresponds to the best advantage of adaptive chosen plaintext attacks, and of adaptive chosen plaintext and ciphertext attacks. In particular we show that previous results on Peanut constructions extend to this setting: DFC has the same provable security against adaptive iterated chosen plaintext and ciphertext attacks.

This paper addresses the first three tasks, but hardly deals with the fourth one, which may deserve further research.



1 Previous Results 

The goal of decorrelation theory is to provide some kinds of formal proof of security on block ciphers. Earlier results were due to Shannon (who showed the limits of unconditional security) and Luby and Rackoff (who showed how the randomness theory is applicable to provide provable security). Decorrelation theory is mainly based on Carter-Wegman’s universal hashing paradigm. As was shown by Wegman and Carter, this makes it possible to provide provably secure Message Authentication Codes.

Results on decorrelation were first published in STACS’98. In that paper, decorrelation bias was formally defined.

Definition 1. Given a random function $F$ from a given set $\mathcal{M}_1$ to a given set $\mathcal{M}_2$ and an integer $d$, we define the “d-wise distribution matrix” $[F]^d$ of $F$ as a $\mathcal{M}_1^d \times \mathcal{M}_2^d$-matrix where the $(x, y)$-entry of $[F]^d$ corresponding to the multi-points $x = (x_1, \ldots, x_d) \in \mathcal{M}_1^d$ and $y = (y_1, \ldots, y_d) \in \mathcal{M}_2^d$ is defined as the probability that we have $F(x_i) = y_i$ for $i = 1, \ldots, d$.



Definition 2. Given a random function $F$ from a given set $\mathcal{M}_1$ to a given set $\mathcal{M}_2$, an integer $d$, and a distance $D$ over the matrix space $\mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$, we define the “d-wise decorrelation bias of function $F$” as being the distance

$$\mathrm{DecF}^d_D(F) = D([F]^d, [F^*]^d)$$

where $F^*$ is a uniformly distributed random function from $\mathcal{M}_1$ to $\mathcal{M}_2$. Similarly, for $\mathcal{M}_1 = \mathcal{M}_2$, if $C$ is a random permutation over $\mathcal{M}_1$ we define the “d-wise decorrelation bias of permutation $C$” as being the distance

$$\mathrm{DecP}^d_D(C) = D([C]^d, [C^*]^d)$$

where $C^*$ is a uniformly distributed random permutation over $\mathcal{M}_1$.

^ A more complete version (with some error fixed in it) is available in

In that work, the infinity-associated matrix norm denoted $|||\cdot|||_\infty$ and defined by

$$|||A|||_\infty = \max_{\text{row } i} \sum_{\text{col } j} |A_{i,j}|$$

was considered. For an injection $r$ from $\{0, 1\}^m$ to $GF(q)$ and a surjection $\pi$ from $GF(q)$ to $\{0, 1\}^m$, it was shown that the random function $F$ defined on $\{0, 1\}^m$ by

$$F(x) = \pi\left(r(K_0) + r(K_1)x + \ldots + r(K_{d-1})x^{d-1}\right)$$

for $(K_0, \ldots, K_{d-1})$ uniformly distributed in $\{0, 1\}^{dm}$ provides a quite good decorrelation. Namely,

$$\mathrm{DecF}^d_{|||\cdot|||_\infty}(F) \le 2\left(q^d \cdot 2^{-md} - 1\right).$$

This construction is called the “NUT-IV decorrelation module” since there are three other ones.

It was shown that this decorrelation could be inherited by a Feistel network in a construction called “Peanut”. Namely, when the round functions of an r-round Feistel network ($r \ge 3$) have a d-wise decorrelation bias less than $\epsilon$, the d-wise decorrelation bias of the whole permutation is less than

$$\left(3\epsilon + 3\epsilon^2 + \epsilon^3 + 2d^2 \cdot 2^{-\frac{m}{2}}\right)^{\lfloor r/3 \rfloor}. \qquad (1)$$

It was also shown that decorrelation to the order 2 provides security against differential and linear attacks.

In SAC’98, the Euclidean norm (denoted $\|\cdot\|_2$) was proposed, and it was shown that the same results hold for the $d = 2$ case with other upper bounds. These bounds are unfortunately worse than the $|||\cdot|||_\infty$ ones, but are applicable to the following decorrelation module, for which the $|||\cdot|||_\infty$ bounds are not:

$$F(x) = (K_0 + K_1 x) \bmod p$$

with $(K_0, K_1) \in_U \{0, \ldots, 2^m - 1\}^2$ and a prime $p < 2^m$.

Based on the Peanut construction, an algorithm called “DFC” was submitted to the Advanced Encryption Standard process.

In Eurocrypt’99, the family of iterated attacks of order $d$ was considered. It was shown that decorrelation to the order $2d$ provides security against iterated attacks of order $d$.

2 A New Decorrelation Measurement Dedicated to Adaptive Attacks

Let us consider a distinguisher $\mathcal{A}$ which is limited to $d$ queries to an oracle $\mathcal{O}$. Its computation power is unlimited, and its output (0 or 1) can be probabilistic. Its aim is to distinguish whether $\mathcal{O}$ implements a random function $F_1$ or a random function $F_2$. For this we consider the advantage

$$\mathrm{Adv}_{\mathcal{A}}(F_1, F_2) = \left|\Pr\left[\mathcal{A}^{\mathcal{O}=F_1} = 1\right] - \Pr\left[\mathcal{A}^{\mathcal{O}=F_2} = 1\right]\right|.$$

We say that $\mathcal{A}$ is non-adaptive if all queries can be sent simultaneously to the oracle (in particular, no query depends on the answer to a previous query). A well known result shows that the largest advantage of a non-adaptive chosen plaintext attack corresponds to the $|||\cdot|||_\infty$ norm of $[F_1]^d - [F_2]^d$. Namely, we have

$$\max_{\substack{\mathcal{A} \text{ non-adaptive} \\ \text{chosen plaintext} \\ d\text{-limited}}} \mathrm{Adv}_{\mathcal{A}}(F_1, F_2) = \frac{1}{2}\,|||[F_1]^d - [F_2]^d|||_\infty.$$

We adapt this result in order to define a new norm which will be denoted $\|\cdot\|_a$.


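Definition 3. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two sets, and $d$ be an integer. For a matrix $A \in \mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$ we define

$$\|A\|_a = \max_{x_1} \sum_{y_1} \max_{x_2} \sum_{y_2} \ldots \max_{x_d} \sum_{y_d} \left|A_{(x_1, \ldots, x_d), (y_1, \ldots, y_d)}\right|.$$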
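The gap between the two norms can be seen on a very small example. The sketch below is an illustration constructed for this purpose, not taken from the paper: it compares a uniformly random function on three points with a hypothetical function `F2` whose "frozen" input depends on the value at 0, so that only an adaptive second query can exploit it. Matrices are stored as dictionaries keyed by `(x_tuple, y_tuple)`, and all names are assumptions.

```python
from fractions import Fraction
from itertools import product

M1, M2, D = (0, 1, 2), (0, 1), 2


def dist_matrix(weighted_functions):
    """d-wise distribution matrix [F]^d of a random function given as (table, prob) pairs."""
    mat = {}
    for x in product(M1, repeat=D):
        for y in product(M2, repeat=D):
            mat[x, y] = sum(p for f, p in weighted_functions
                            if all(f[xi] == yi for xi, yi in zip(x, y)))
    return mat


def norm_inf(a):
    """Largest absolute row sum: the non-adaptive norm."""
    return max(sum(abs(a[x, y]) for y in product(M2, repeat=D))
               for x in product(M1, repeat=D))


def norm_a(a, xs=(), ys=()):
    """Alternate max over x_j and sum over y_j for j = 1..d, as in Definition 3."""
    if len(xs) == D:
        return abs(a[xs, ys])
    return max(sum(norm_a(a, xs + (x,), ys + (y,)) for y in M2) for x in M1)


# F1: uniformly random function {0,1,2} -> {0,1}.
F1 = [(f, Fraction(1, 8)) for f in product(M2, repeat=3)]
# F2 (hypothetical): F(0) is uniform; if F(0) = 0 then F(1) = 0, else F(2) = 0.
F2 = [((0, 0, 0), Fraction(1, 4)), ((0, 0, 1), Fraction(1, 4)),
      ((1, 0, 0), Fraction(1, 4)), ((1, 1, 0), Fraction(1, 4))]

p1, p2 = dist_matrix(F1), dist_matrix(F2)
diff = {key: p1[key] - p2[key] for key in p1}
print(norm_inf(diff), norm_a(diff))  # prints "1/2 1"
```

Halving the two values gives a best non-adaptive advantage of 1/4 and a best adaptive advantage of 1/2 for this pair, in line with the relation between norms and advantages stated in Theorem 4.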



Theorem 4. For any random functions $F_1$ and $F_2$ from a set $\mathcal{M}_1$ to a set $\mathcal{M}_2$ and any integer $d$, we have

$$\max_{\substack{\mathcal{A} \text{ distinguisher} \\ d\text{-limited}}} \mathrm{Adv}_{\mathcal{A}}(F_1, F_2) = \frac{1}{2}\,\|[F_1]^d - [F_2]^d\|_a.$$

Proof. Let $\mathcal{A}$ be a distinguisher. It first queries with a random $X_1$ (where the randomness comes from $\mathcal{A}$ only), then gets a random $Y_1$ (whose randomness also comes from $\mathcal{O}$). Then it queries a random $X_2$ which depends on $X_1$ and $Y_1$, and gets a $Y_2$, and so on. At the end, $\mathcal{A}$ answers a random value $A = 0$ or $1$. We have

$$\Pr\left[\mathcal{A}^{\mathcal{O}} = 1\right] = \sum_{x_1, y_1, \ldots, x_d, y_d} \Pr[x_1] \Pr[y_1/x_1] \ldots \Pr[A = 1/x_1 \ldots y_d].$$

Let $p_i = \Pr\left[\mathcal{A}^{\mathcal{O}=F_i} = 1\right]$. Since the randomness of $\mathcal{A}$ and $F_i$ are independent, we have

$$p_i = \sum_{x_1, y_1, \ldots, x_d, y_d} \Pr[x_1] \Pr[x_2/x_1, y_1] \ldots \Pr[A = 1/x_1 \ldots y_d]\,[F_i]^d_{x,y}$$

where $x = (x_1, \ldots, x_d)$ and $y = (y_1, \ldots, y_d)$. This is a sum of terms of the form $\Pr[x_1] f(x_1)$. Obviously, the advantage is maximal when $\Pr[x_1] = 1$ for the maximal $f(x_1)$. Actually, we show that this sum is maximal for some deterministic distinguisher in which $x_j$ is a function of $y_1, \ldots, y_{j-1}$ only. We have

$$p_1 - p_2 = \sum_y a_y \left([F_1]^d_{x,y} - [F_2]^d_{x,y}\right)$$

where $a_y$ is 0 or 1. Obviously, this difference is maximal if $a_y$ is 1 for the positive terms, and 0 for the negative terms. We notice that the sum of all terms is 0. Hence we have

$$p_1 - p_2 = \frac{1}{2} \sum_y \left|[F_1]^d_{x,y} - [F_2]^d_{x,y}\right|$$

when it is maximal. The choice of $x$ which maximizes this sum completes the proof. □

In order to deal with decorrelation biases, it is pleasant to have matrix norms, i.e. norms such that $\|A \times B\| \le \|A\| . \|B\|$. If we have such a norm, we actually have the following property

$$\mathrm{DecP}^d_{\|\cdot\|}(C_1 \circ C_2) \le \mathrm{DecP}^d_{\|\cdot\|}(C_1) . \mathrm{DecP}^d_{\|\cdot\|}(C_2) \qquad (2)$$

and the same for $\mathrm{DecF}_{\|\cdot\|}$. The following result says it is applicable for the $\|\cdot\|_a$ norm.

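Theorem 5. $\|\cdot\|_a$ is a matrix norm.

Proof. We make an induction on $d$. Let $A$ be a matrix in $\mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$. To each $x_1 \in \mathcal{M}_1$ and each $y_1 \in \mathcal{M}_2$ we associate a submatrix $\pi_{x_1,y_1}(A)$ in $\mathbb{R}^{\mathcal{M}_1^{d-1} \times \mathcal{M}_2^{d-1}}$ defined by

$$\left(\pi_{x_1,y_1}(A)\right)_{(x_2, \ldots, x_d), (y_2, \ldots, y_d)} = A_{(x_1, \ldots, x_d), (y_1, \ldots, y_d)}.$$

These submatrices actually define a matrix $\pi(A)$ which is basically a different way of viewing $A$. We have the following property which links the corresponding norms for the parameters $d$ and $d - 1$

$$\|A\|_a = \max_{x_1} \sum_{y_1} \|\pi_{x_1,y_1}(A)\|_a.$$

Let $A$ and $B$ be two matrices. We have

$$\|A \times B\|_a = \max_{x_1} \sum_{y_1} \|\pi_{x_1,y_1}(A \times B)\|_a.$$

Straightforward computations show that

$$\pi_{x_1,y_1}(A \times B) = \sum_{t_1} \pi_{x_1,t_1}(A) \times \pi_{t_1,y_1}(B).$$

Thus by induction we have

$$\|A \times B\|_a \le \max_{x_1} \sum_{y_1} \sum_{t_1} \|\pi_{x_1,t_1}(A) \times \pi_{t_1,y_1}(B)\|_a$$

$$\le \max_{x_1} \sum_{y_1} \sum_{t_1} \|\pi_{x_1,t_1}(A)\|_a . \|\pi_{t_1,y_1}(B)\|_a.$$

This last expression is actually a $|||\cdot|||_\infty$ norm of the product of two matrices $A'$ and $B'$ defined by

$$(A')_{x_1,t_1} = \|\pi_{x_1,t_1}(A)\|_a$$

and

$$(B')_{t_1,y_1} = \|\pi_{t_1,y_1}(B)\|_a.$$

We already know that $|||\cdot|||_\infty$ is a matrix norm. Therefore we have

$$\|A \times B\|_a \le |||A' \times B'|||_\infty \le |||A'|||_\infty . |||B'|||_\infty$$

which is $\|A\|_a . \|B\|_a$. □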

As in previous work, this theorem implies the following properties.

Corollary 6. For any random functions $F_1, \ldots, F_4$, if $F^*$ denotes a random function with uniform distribution, the following properties hold.

$$\mathrm{DecF}^d_{\|\cdot\|_a}(F_1 \circ F_2) \le \mathrm{DecF}^d_{\|\cdot\|_a}(F_1) . \mathrm{DecF}^d_{\|\cdot\|_a}(F_2) \qquad (3)$$

$$\|[F_1 \circ F_2]^d - [F_1 \circ F_3]^d\|_a \le \mathrm{DecF}^d_{\|\cdot\|_a}(F_1) . \|[F_2]^d - [F_3]^d\|_a \qquad (4)$$

$$\|[F_1 \circ F_2]^d - [F_3 \circ F_4]^d\|_a \le \mathrm{DecF}^d_{\|\cdot\|_a}(F_1) . \|[F_2]^d - [F_4]^d\|_a + \mathrm{DecF}^d_{\|\cdot\|_a}(F_4) . \|[F_1]^d - [F_3]^d\|_a \qquad (5)$$

Similar properties hold for permutations.

We outline that Equation (3) means that if $F_1$ and $F_2$ are two independent random functions with the same distribution and if $a$ is the best advantage of a d-limited distinguisher between $F_1$ and a uniformly distributed random function $F^*$, then the best advantage of a d-limited distinguisher between $F_1 \circ F_2$ and $F^*$ is less than $2a^2$. Similarly, for $r$ rounds, the best advantage is less than $\frac{1}{2}(2a)^r$: the advantage decreases exponentially with the number of rounds.
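The arithmetic behind the $2a^2$ and $\frac{1}{2}(2a)^r$ bounds is short: by Theorem 4 the decorrelation bias is twice the best advantage, and by Equation (3) the biases multiply. A minimal sketch, with a hypothetical function name:

```python
def composed_advantage_bound(a, r):
    """Bound (1/2) * (2a)^r on the best advantage against r composed independent copies."""
    bias = 2 * a            # Theorem 4: bias = 2 * best advantage
    return 0.5 * bias ** r  # Equation (3): biases multiply under composition


for rounds in (1, 2, 4, 8):
    print(rounds, composed_advantage_bound(0.25, rounds))
```

The bound only shrinks with $r$ when $a < 1/2$; for $a = 1/4$ it halves with every additional round.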

3 On the NUT-IV Decorrelation Module

The DFC algorithm is a Peanut construction which uses the following decorrelation module.

$$F(x) = \left((Ax + B) \bmod (2^{64} + 13)\right) \bmod 2^{64}$$

where $(A, B) \in_U \{0, \ldots, 2^{64} - 1\}^2$. This is a particular case of the NUT-IV decorrelation module for which we prove the same bound for its decorrelation bias, but with the $\|\cdot\|_a$ norm.

Theorem 7. For an integer $m$, let $q = 2^m(1 + \delta)$ be a prime power with $\delta > 0$. We consider an injection $r$ from $\{0, 1\}^m$ to $GF(q)$, and a surjection $\pi$ from $GF(q)$ to $\{0, 1\}^m$. We define the following random function on $\{0, 1\}^m$.

$$F(x) = \pi\left(r(A_0) + r(A_1).r(x) + \ldots + r(A_{d-1}).r(x)^{d-1}\right)$$

where $(A_0, \ldots, A_{d-1}) \in_U \{0, 1\}^{md}$. We have

$$\mathrm{DecF}^d_{\|\cdot\|_a}(F) \le 2\left((1 + \delta)^d - 1\right).$$

Proof. We adapt the proof from previous work. We let $F^*$ be a uniformly distributed random function. In the computation of $\|[F]^d - [F^*]^d\|_a$, let $x_1$, $x_2 = f_2(y_1)$, ..., $x_d = f_d(y_1, \ldots, y_{d-1})$ be such that

$$\sum_{y = (y_1, \ldots, y_d)} \left|[F]^d_{x,y} - [F^*]^d_{x,y}\right|$$

is maximal, where $x = (x_1, \ldots, x_d)$.

For some terms in the sum, some $x_i$ may be equal to each other. For this we need to make a transformation in order to assume that all $x_i$s are pairwise different. For any $(x, y)$ term, let $c$ be the total number of different $x_i$s. Let $\sigma$ be a monotone injection from $\{1, \ldots, c\}$ to $\{1, \ldots, d\}$ such that all $x_{\sigma(i)}$ are different. We notice that if $x_i = x_j$, we can restrict the sum to $y_i = y_j$ (because the other terms will be all zero). We thus still have $x'_i = x_{\sigma(i)} = f'_i(y'_1, \ldots, y'_{i-1})$ where $y'_i = y_{\sigma(i)}$, and all $x'_i$ are pairwise different for $i = 1, \ldots, c$. We can now define $x'_{c+1}, \ldots, x'_d$ with some new arbitrary functions $f'_{c+1}, \ldots, f'_d$ in such a way that all $x'_i$ are pairwise different. We have

$$\mathrm{DecF}^d_{\|\cdot\|_a}(F) = \sum_{y'_1, \ldots, y'_c} \left| \sum_{y'_{c+1}, \ldots, y'_d} \left([F]^d_{x',y'} - [F^*]^d_{x',y'}\right) \right| \le \sum_{y'_1, \ldots, y'_d} \left|[F]^d_{x',y'} - [F^*]^d_{x',y'}\right|$$

where $x' = (x'_1, \ldots, x'_d)$ and $y' = (y'_1, \ldots, y'_d)$. Hence we can assume without loss of generality that all $x_i$s are pairwise different.

Obviously, $[F]^d_{x,y}$ can be written $j.2^{-md}$ where $j$ is an integer. Let $N_j$ be the number of $y$ such that $[F]^d_{x,y} = j.2^{-md}$. We have

$$\mathrm{DecF}^d_{\|\cdot\|_a}(F) \le \sum_{j=0}^{+\infty} N_j |j - 1| 2^{-md} = 2N_0.2^{-md} + \sum_{j=0}^{+\infty} N_j j.2^{-md} - \sum_{j=0}^{+\infty} N_j 2^{-md}.$$

The first sum is equal to $\sum_y [F]^d_{x,y}$, which is equal to 1. The second sum is $2^{-md}$ times the total number of $y$, which is also 1. Thus we have $\mathrm{DecF}^d_{\|\cdot\|_a}(F) \le 2N_0.2^{-md}$.

Let $A$ be the set of all $(x, y)$ such that $[F]^d_{x,y} = 0$. Let $B$ be the set of all $(a_0, \ldots, a_{d-1})$ in $GF(q)^d$ such that for at least one $j$ we have $a_j \notin r(\{0, 1\}^m)$. From usual interpolation tricks, we know that for any $(x, y)$ in $A$ there exists at least one $(a_0, \ldots, a_{d-1})$ in $GF(q)^d$ such that

$$\pi\left(a_0 + a_1.r(x_j) + \ldots + a_{d-1}.r(x_j)^{d-1}\right) = y_j$$

for $j = 1, \ldots, d$. Since $[F]^d_{x,y} = 0$, $(a_0, \ldots, a_{d-1})$ must be in $B$. Furthermore this mapping from $A$ to $B$ must be an injection. Hence $N_0$, which is the cardinality of $A$, is less than the cardinality of $B$, which is $q^d - 2^{md}$. □
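The bound of Theorem 7 can be observed on a toy instance where everything is small enough to enumerate. The sketch below makes several assumptions that are not in the paper: $m = 3$, $q = 11$ (so $\delta = 3/8$), $d = 2$, $r$ taken as the inclusion of $\{0, \ldots, 7\}$ into $GF(11)$ and $\pi$ as reduction modulo $2^m$. It also includes the 64-bit DFC module from the start of this section for comparison; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

M_BITS, Q_PRIME, D = 3, 11, 2          # toy sizes: q = 11 = 2^3 * (1 + 3/8)
DOMAIN = tuple(range(2 ** M_BITS))
DFC_PRIME = 2 ** 64 + 13


def dfc_module(x, a, b):
    """DFC decorrelation module: ((a*x + b) mod (2^64 + 13)) mod 2^64."""
    return ((a * x + b) % DFC_PRIME) % 2 ** 64


def toy_module(key, x):
    """Toy NUT-IV module for d = 2: pi(r(K0) + r(K1) * r(x)) over GF(11)."""
    return ((key[0] + key[1] * x) % Q_PRIME) % 2 ** M_BITS


def bias_matrix():
    """[F]^2 - [F*]^2 as a dict keyed by ((x1, x2), (y1, y2))."""
    keys = list(product(DOMAIN, repeat=D))
    size = len(DOMAIN)
    diff = {}
    for x in product(DOMAIN, repeat=D):
        for y in product(DOMAIN, repeat=D):
            if x[0] != x[1]:
                ideal = Fraction(1, size ** 2)   # independent uniform outputs
            else:
                ideal = Fraction(1, size) if y[0] == y[1] else Fraction(0)
            diff[x, y] = -ideal
        for k in keys:
            y = tuple(toy_module(k, xi) for xi in x)
            diff[x, y] += Fraction(1, len(keys))
    return diff


def norm_a(a, xs=(), ys=()):
    """Adaptive norm: alternate max over x_j and sum over y_j."""
    if len(xs) == D:
        return abs(a[xs, ys])
    return max(sum(norm_a(a, xs + (x,), ys + (y,)) for y in DOMAIN)
               for x in DOMAIN)


delta = Fraction(Q_PRIME, 2 ** M_BITS) - 1
bound = 2 * ((1 + delta) ** D - 1)
bias = norm_a(bias_matrix())
print(bias, "<=", bound)
```

For the real DFC parameters, $\delta = 13 \cdot 2^{-64}$, so the same formula gives a bias bound of roughly $4\delta$ for $d = 2$.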



4 Decorrelation of Peanut-Like Constructions 



We show here that the decorrelation of internal decorrelation modules in a cipher can be inherited by the whole scheme.

Lemma 8. Let $d$ be an integer, and $F_1, \ldots, F_r$ be $r$ random functions which are used in order to define a random function $\Omega(F_1, \ldots, F_r)$. We assume that the $\Omega$ structure is such that for any $x$, computing $\Omega(F_1, \ldots, F_r)(x)$ requires $a_i$ computations of $F_i$ for $i = 1, \ldots, r$. We have

$$\|[\Omega(F_1, \ldots, F_r)]^d - [\Omega(F^*_1, \ldots, F^*_r)]^d\|_a \le \sum_{i=1}^{r} \mathrm{DecF}^{a_i d}_{\|\cdot\|_a}(F_i)$$

where $F^*_1, \ldots, F^*_r$ are uniformly distributed random functions.



Proof. By triangular inequalities, we have

$$\|[\Omega(F_1, \ldots, F_r)]^d - [\Omega(F^*_1, \ldots, F^*_r)]^d\|_a \le \sum_{i=1}^{r} \|[\Omega(F_1, \ldots, F_{i-1}, F^*_i, \ldots, F^*_r)]^d - [\Omega(F_1, \ldots, F_i, F^*_{i+1}, \ldots, F^*_r)]^d\|_a.$$

From Theorem 4 each term corresponds to the best distinguisher between $\Omega(F_1, \ldots, F_{i-1}, F^*_i, \ldots, F^*_r)$ and $\Omega(F_1, \ldots, F_i, F^*_{i+1}, \ldots, F^*_r)$. This attack can be transformed into a distinguisher between $F_i$ and $F^*_i$ by simulating the other functions. Hence this attack cannot have an advantage greater than the best attack for distinguishing $F_i$ from $F^*_i$ with the same number of queries. The number of queries for this attack is at most $a_i d$. By applying back Theorem 4 we obtain the result. □



This lemma can be considered as a “meta-theorem” which is applicable to any product cipher construction. For instance, for the Feistel construction $\Psi(F_1, \ldots, F_r)$, we have $a_i = 1$ for all $i$. The Peanut construction consists of picking decorrelated modules as round functions. In order to finish estimating the decorrelation of Feistel structures, we need a lemma in order to estimate the decorrelation of Feistel ciphers with truly random functions. This is precisely the Luby-Rackoff Lemma.

Lemma 9 (Luby-Rackoff 1988). Let $F^*_1, F^*_2, F^*_3$ be three random functions on $\{0, 1\}^{\frac{m}{2}}$ with uniform distribution. We have

$$\mathrm{DecP}^d_{\|\cdot\|_a}(\Psi(F^*_1, F^*_2, F^*_3)) \le 2d^2 . 2^{-\frac{m}{2}}.$$

(This is a straightforward translation of the original result by using Theorem 4.) We can thus upper bound the decorrelation bias in a Peanut construction.

Corollary 10. If $F_1, \ldots, F_r$ are $r$ random functions ($r \ge 3$) on $\{0, 1\}^{\frac{m}{2}}$ such that $\mathrm{DecF}^d_{\|\cdot\|_a}(F_i) \le \epsilon$, we have

$$\mathrm{DecP}^d_{\|\cdot\|_a}(\Psi(F_1, \ldots, F_r)) \le \left(3\epsilon + 2d^2 . 2^{-\frac{m}{2}}\right)^{\lfloor r/3 \rfloor}.$$

We note that this slightly improves Equation (1), taken from previous work.

Proof. Since the best advantage cannot increase when we make a product of independent ciphers, Lemma 9 holds for any Feistel cipher with at least three rounds. We write $\Psi(F_1, \ldots, F_r)$ as a product of $\lfloor r/3 \rfloor$ Feistel ciphers with at least 3 rounds. We apply Lemma 8 and Lemma 9 to each of them, and we finally apply Equation (2). □
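To get a feeling for the size of the bound in Corollary 10, it can be evaluated with exact rationals. This is only a sketch: the formula is taken as it reads in Corollary 10, the function name `peanut_bound` is hypothetical, and the DFC-like parameters in the usage lines ($m = 128$, $d = 2$, $r = 8$, round-function bias from Theorem 7 with $\delta = 13 \cdot 2^{-64}$) are assumptions chosen for illustration.

```python
from fractions import Fraction


def peanut_bound(eps, d, m, r):
    """(3*eps + 2*d^2 * 2^(-m/2)) ** floor(r/3) for an r-round Feistel scheme, m even."""
    assert m % 2 == 0 and r >= 3
    per_block = 3 * eps + Fraction(2 * d * d, 2 ** (m // 2))
    return per_block ** (r // 3)


# Assumed DFC-like parameters, for illustration only.
delta = Fraction(13, 2 ** 64)
eps = 2 * ((1 + delta) ** 2 - 1)
bound = peanut_bound(eps, d=2, m=128, r=8)
print(float(bound))
```

Because the exponent is $\lfloor r/3 \rfloor$, the bound improves only when the number of rounds crosses a multiple of three.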

As another example, we mention this lemma taken from previous work.

Lemma 11 (Patarin 1992). Let $\zeta$ be a permutation on $\{0, 1\}^{\frac{m}{2}}$ such that for any $y$ there exists at least $\lambda$ values $x$ such that $x \oplus \zeta(x) = y$. Given a uniformly distributed random function $F^*$ on $\{0, 1\}^{\frac{m}{2}}$ and an integer $d$, we have

$$\mathrm{DecP}^d_{\|\cdot\|_a}(\Psi(F^*, F^*, F^* \circ \zeta \circ F^*)) \le 13d^2 2^{-\frac{m}{2}} + 2\lambda d 2^{-\frac{m}{2}}.$$

Corollary 12. Let $\zeta$ be a permutation on $\{0, 1\}^{\frac{m}{2}}$ such that for any $y$ there exists at least $\lambda$ values $x$ such that $x \oplus \zeta(x) = y$. Given independent random functions $F_1, \ldots, F_r$ on $\{0, 1\}^{\frac{m}{2}}$ and an integer $d$, we have

$$\mathrm{DecP}^d_{\|\cdot\|_a}(\Psi(F_1, F_1, F_1 \circ \zeta \circ F_1, \ldots, F_r, F_r, F_r \circ \zeta \circ F_r)) \le \left(13d^2 2^{-\frac{m}{2}} + 2\lambda d 2^{-\frac{m}{2}} + \epsilon\right)^r$$

where $\epsilon = \max_i \mathrm{DecF}^{4d}_{\|\cdot\|_a}(F_i)$.

Other product constructions require an equivalent to Lemma 9. For instance, the Lai-Massey scheme which is used in IDEA has an equivalent result.



5 Super-Pseudorandomness 

We now address the problem of d-limited adaptive chosen plaintext and ciphertext distinguishers. Since the proofs are essentially the same, we do not give all details here. We first define the corresponding norm.

Definition 13. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two sets, and $d$ be an integer. For a matrix $A \in \mathbb{R}^{\mathcal{M}_1^d \times \mathcal{M}_2^d}$ we let $\pi_{x_1,y_1}(A)$ denote the matrix in $\mathbb{R}^{\mathcal{M}_1^{d-1} \times \mathcal{M}_2^{d-1}}$ defined by

$$\left(\pi_{x_1,y_1}(A)\right)_{(x_2, \ldots, x_d), (y_2, \ldots, y_d)} = A_{(x_1, \ldots, x_d), (y_1, \ldots, y_d)}.$$

By induction on $d$ we define

$$\|A\|_s = \max\left(\max_{x_1} \sum_{y_1} \|\pi_{x_1,y_1}(A)\|_s,\ \max_{y_1} \sum_{x_1} \|\pi_{x_1,y_1}(A)\|_s\right)$$

with the convention that $\|A\|_s = |A_{(),()}|$ for $d = 0$.

Since chosen ciphertext makes sense for permutations only, all the following results hold for permutations.

Theorem 14. For any random permutations $C_1$ and $C_2$ over a set $\mathcal{M}$ and any integer $d$, we have

$$\max_{\substack{\mathcal{A} \text{ distinguisher} \\ \text{chosen plaintext and ciphertext} \\ d\text{-limited}}} \mathrm{Adv}_{\mathcal{A}}(C_1, C_2) = \frac{1}{2}\,\|[C_1]^d - [C_2]^d\|_s.$$

The proof is a straightforward adaptation of the proof of Theorem 4.

Theorem 15. $\|\cdot\|_s$ is a matrix norm.

For this proof, we adapt the proof of Theorem 5 and notice that

$$\max_{y_1} \sum_{x_1} |M_{x_1,y_1}| = |||M|||_1$$

where $|||\cdot|||_1$ is the matrix norm associated to the $L_1$ vector norm. Therefore Corollary 6 holds for the $\|\cdot\|_s$ norm with permutations.

We can also extend Lemma 8.

Lemma 16. Let d be an integer, F\,...,Fr be r random functions and let 
Cl, . . . , Cs be s random permutations which are used in order to define a random 
permutation C = L2{Fi, . . . , Fr,C\, . . . , Cs). We assume that the fl structure is 
such that for any x and y, computing C(x) or C~^{y) requires Oi computations 
of Fi for i = 1, . . . , r and bt computations of Ct or C~^ for i = 1, . . . , s. We 
have 



1 1 [n{Fi, ...,Fr,Ci,..., Cs)]'^ - [n{Fl ..., Ff, Cl ..., c:)]‘^| 









where , F* are uniformly distributed random functions and C*, . . . , C* are 

uniformly distributed random permutations. 
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For instance, in the Peanut construction, we have s = 0 and at = 1 for all i. 
However LemmaHis not applicable with the ||.||s norm. We thus use a similar 
lemma for 4-round Feistel ciphers. 

Lemma 17 (Luby-Rackoff 1988). Let $F_1^*, F_2^*, F_3^*, F_4^*$ be four random
functions on $\{0,1\}^{m/2}$ with uniform distribution. We have

$$\mathrm{DecP}^d_{\|\cdot\|_s}(\Psi(F_1^*, F_2^*, F_3^*, F_4^*)) \le 2 d^2 \cdot 2^{-m/2}.$$

We can therefore measure the decorrelation in the sense of $\|\cdot\|_s$ of Peanut
constructions.

Corollary 18. If $F_1, \ldots, F_r$ are r random functions ($r \ge 4$) on $\{0,1\}^{m/2}$ such
that $\mathrm{DecF}^d_{\|\cdot\|_s}(F_i) \le \epsilon$, we have

$$\mathrm{DecP}^d_{\|\cdot\|_s}(\Psi(F_1, \ldots, F_r)) \le \left(4\epsilon + 2 d^2 \cdot 2^{-m/2}\right)^{\lfloor r/4 \rfloor}.$$

This shows how super-pseudorandom a Peanut construction is.

6 Security by Decorrelation with the New Norms 

We already know that decorrelation with the $|||\cdot|||_\infty$ norm makes it possible to
prove security against differential and linear distinguishers and against
non-adaptive chosen plaintext iterated attacks. Since we have
$\|\cdot\|_s \ge \|\cdot\|_a \ge |||\cdot|||_\infty$, all these results
are applicable to the decorrelation with the $\|\cdot\|_a$ and $\|\cdot\|_s$ norms.

From the proofs of those results it is quite clear that all results on iterated
attacks extend to $\|\cdot\|_a$-decorrelation when each iteration is adaptive, and to
$\|\cdot\|_s$-decorrelation when they can use chosen ciphertexts in addition.

One open question remains from the iterated attack results. It was shown
earlier that the security proof requires some assumption on the distribution of
the queries to the oracle. This was meaningful when we addressed known
plaintext non-adaptive attacks. But adaptive attacks are chosen plaintext attacks
in essence. It thus remains to improve those results in order to get
provable security against these attacks.

7 Conclusion 

We have shown which matrix norms adaptive attacks and chosen plaintext and
ciphertext attacks are related to. These norms define a much stronger notion of
decorrelation. We have shown that previous upper bounds on the decorrelation
extend to these new norms, in particular for the Peanut construction and the
NUT-IV decorrelation module. We also generalized the Peanut construction to
any scheme which is not necessarily a Feistel one. We have shown that if it is a
product scheme, then we can upper bound the decorrelation of the whole scheme
from the decorrelation of its internal functions, provided that we can extend the
Luby-Rackoff Lemma to this scheme.

Our formalism is practical enough to make it trivial to show the exponential
decrease of the best advantage of a distinguisher between a product
cipher and a truly random cipher.


Abstract. Absolute lower limits to the cost of cryptanalytic attacks are
quantified, via a theory of guesswork. Conditional guesswork naturally 
expresses limits to known and chosen plaintext attacks. New inequalities 
are derived between various forms of guesswork and variation distance. 
The machinery thus offers a new technique for establishing the secu- 
rity of a cipher: When the work-factor of the optimal known or chosen 
plaintext attack against a cipher is bounded below by a prohibitively 
large number, then no practical attack against the cipher can succeed. 
As an example, we apply the technique to iterated cryptosystems, as the 
Markov property which results from an independent subkey assumption 
makes them particularly amenable to analysis. 



1 Introduction 

Research on provably secure ciphers often focuses on specific cipher properties 
or resistance to specific families of attacks (see e.g. D ^3)- When 
general attacks are considered, the adversary’s resource limitations are typically 
built into the equation. In the Luby-Rackoff model (see ^3 or more recently 
^3); the adversary is assumed to have bounded computational resources. In the 
Decorrelation Theory of Vaudenay (see e.g. ^3 the references in ^3)> the 
adversary may have restricted data complexity (such as a bound on the num- 
ber of plaintext-ciphertext pairs) or may be carrying out a constrained attack 
(such as Differential Cryptanalysis Q). In this paper, we summarize a different 
approach to provable cipher security which is developed more fully in ^3- Our 
approach is to model a cipher as a group-valued random variable — following 
Shannon — and derive absolute lower limits on the work-factor for discovering 
its secret key. 

This technique naturally applies to product ciphers and iterated cryptosystems.
Figure 1 below depicts a hypothetical security profile for the behavior of
the product of finitely many ciphers as a function of the number of terms. In
order to begin to quantify this profile, we must find a meaningful measure of
security for which establishing the profile's shape in certain places is a tractable
problem. Our primary interest is in the non-asymptotic shape of the curve,
because iterated cryptosystems cannot iterate forever.

Fig. 1. A hypothetical profile of security as a function of the number of terms in
a product, or equivalently the number of rounds in an iterated cryptosystem,
assuming subkey independence.



We posit that a reasonable security measure is the expected work involved in 
“guessing” the cipher’s key from the set of all keys which remain consistent with 
acquired plaintext-ciphertext pairs. We call this measure guesswork (or more 
generally conditional guesswork) and demonstrate its tractability by making use 
of techniques from the modern theory of random walks on symmetric structures. 

Starting in Sect. 3 and continuing in Sect. 4, a formal theory of guesswork
is developed which parallels information theory in a number of interesting ways.
In particular, (logarithmically) tight bounds involving guesswork are derived
in Theorem 1, and variation distance plays a role similar to Kullback-Leibler
distance. With the help of these tools, we turn our attention in Sect. 5 to
quantifying the shape of the security profile of Fig. 1 non-asymptotically. That is
to say, rather than in some unknown neighborhood of the point at infinity, we
establish provable security after a finite number of rounds.

2 Preliminaries 

A basic familiarity with group theory, and with random variables and probability
spaces, is assumed. We develop an abstract form of Shannon's model of
private key ciphers in which the invertible encryption functions are taken
as elements of a group G. For a block cipher with a message space $\mathcal{M}$ consisting
of all n-bit strings, the group G is naturally seen as a subgroup of the symmetric
group $\mathfrak{S}_{\mathcal{M}}$ (whose elements consist of all permutations of $\mathcal{M}$).

2.1 Shannon’s Model 

Secret keys and messages must be considered random from the viewpoint of a
cryptanalytic adversary. Thus, the eavesdropper on an insecure channel may be
thought of as performing a probabilistic experiment in which the message and
key are values drawn at random according to certain probability distributions.
It is assumed that the key is statistically independent of the message, and that
the individual plaintext blocks of the message are statistically independent of
one another.^1 This allows the cipher itself to be treated as an independent random
variable. Furthermore, we may dispense with a distinction between the key
space and the group G generated by the encryption functions. Each possible key
corresponds to an element of G, and any element of G which is not identified
with a key is taken to have probability 0. We may now formally define a cipher.

Definition 1. Given a finite message space $\mathcal{M}$ and a subgroup $G \le \mathfrak{S}_{\mathcal{M}}$, a
G-cipher or a cipher over G is a G-valued random variable.

Shannon wrote down a cipher as a linear combination of encryption functions, 
with the coefficients taken to be the probabilities of the corresponding functions. 
He naturally defined the product of ciphers by merely enforcing the distributive 
laws. Shannon was essentially defining what is now commonly called the group 
algebra. The natural product in the group algebra is equivalent to a kind of 
convolution of probability distributions. 

2.2 Product Ciphers and Convolution 

Consider the situation where an encryption operation is the composition of two
independent encryption operations. This leads to the formal notion of the product
of two ciphers over a group. Let X and Y be independent G-ciphers with
probability distributions $x(g) = P[X = g]$ and $y(g) = P[Y = g]$. The G-cipher
Z = XY is called a product cipher, Y is called the first component and X is
called the second component.

Let us examine the distribution $z(g) = P[Z = g]$.

$$z(g) = \sum_{h \in G} P[X = g h^{-1} \mid Y = h]\, P[Y = h]$$

$$= \sum_{h \in G} P[X = g h^{-1}]\, P[Y = h] = \sum_{h \in G} x(g h^{-1})\, y(h).$$

Notice how much this last expression looks like convolution. In fact, if G were 
the abelian group of integers modulo n and the multiplicative notation were 
replaced by additive notation, z{g) would literally be the circular convolution of 
the functions x and y. So, z{g) is a kind of generalized convolution and will be 
written 

$$x * y(g) = \sum_{h \in G} x(g h^{-1})\, y(h). \qquad (1)$$

Thus the distribution of a product is described by the convolution of the com- 
ponent distributions. Intuitively, convolutions “smooth out” distributions. 
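
As an illustration of equation (1), the following minimal Python sketch convolves two distributions on the symmetric group on three points. The representation of a permutation as a tuple `g` with `g[i]` the image of `i`, and the helper names `compose`, `inverse` and `convolve`, are assumptions made for this example only.

```python
from itertools import permutations

def compose(g, h):
    """Return the permutation "apply h first, then g" (tuples map i -> g[i])."""
    return tuple(g[h[i]] for i in range(len(h)))

def inverse(g):
    """Return the inverse permutation of g."""
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def convolve(x, y, group):
    """Equation (1): (x*y)(g) = sum over h in G of x(g h^-1) y(h)."""
    return {
        g: sum(x.get(compose(g, inverse(h)), 0.0) * y.get(h, 0.0) for h in group)
        for g in group
    }

# Hypothetical example: two ciphers, each uniform on two elements of S_3.
S3 = list(permutations(range(3)))
identity, swap, cycle = (0, 1, 2), (1, 0, 2), (1, 2, 0)
x = {identity: 0.5, swap: 0.5}    # distribution of the second component X
y = {identity: 0.5, cycle: 0.5}   # distribution of the first component Y
z = convolve(x, y, S3)            # distribution of the product cipher Z = XY
```

In this toy case each component is supported on two permutations while the product is supported on four, each with probability 1/4, which is the "smoothing" effect in miniature.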

^1 Technically, plaintext blocks are typically independent only in the limit of large block
length, but the emphasis here is on chosen plaintext attacks, which are always faster
than known plaintext attacks.

2.3 Variation Distance

Let $\mathcal{X}$ be a finite set with probability distributions p and q defined on it. Recall
the variation distance between p and q, defined by

$$\|p - q\| = \max_{\mathcal{Y} \subseteq \mathcal{X}} |p(\mathcal{Y}) - q(\mathcal{Y})|. \qquad (2)$$

It is a standard observation that variation distance is half of the $\ell_1$-norm, i.e.

$$\|p - q\| = \frac{1}{2} \|p - q\|_1,$$

and that the maximum in (2) is achieved on the set

$$\mathcal{Y} = \{x \in \mathcal{X} \mid p(x) > q(x)\}.$$

If X is a G-cipher with distribution $p_X$ and u is the uniform distribution on G, then the closer
$p_X$ is to uniformity, the harder it will be for any adversary to determine the
value of X. This general statement, which holds whether or not the adversary
is in possession of plaintext-ciphertext pairs, is formalized in Theorem ^ and 
CorollaryHbelow. 
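
The two descriptions of variation distance given in this section (the maximum over subsets and half of the $\ell_1$-norm) can be compared numerically. The sketch below is only an illustration: distributions are assumed to be Python dicts, the function names are invented here, and the subset search is exponential, so it is meant for tiny sets only.

```python
from itertools import combinations

def variation_distance(p, q):
    """Half of the l1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def variation_distance_by_max(p, q):
    """Definition (2): maximise |p(Y) - q(Y)| over every subset Y."""
    keys = sorted(set(p) | set(q))
    best = 0.0
    for size in range(len(keys) + 1):
        for subset in combinations(keys, size):
            diff = abs(sum(p.get(k, 0.0) - q.get(k, 0.0) for k in subset))
            best = max(best, diff)
    return best

# Hypothetical example: a skewed distribution against the uniform one.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
u = {k: 1.0 / 3 for k in p}
d_half_l1 = variation_distance(p, u)
d_max = variation_distance_by_max(p, u)
```

Both functions return 1/6 here, up to floating-point rounding.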

3 Guesswork: The Uncertainty of Guessing 

In this section, we develop the means to quantify fundamental statistical limits 
to the amount of work required to determine the value of a random variable. 
The notions of work discussed here have appeared before. In a broad sense they
are intimately connected to Lorenz's theory of wealth distribution. Massey was
the first to formulate, in open cryptology, what we shall
call the guesswork of a random variable.^2 While it has correctly been pointed
out that guesswork is not a meaningful predictor of practical attack
performance, we shall show that it is a very useful and tractable measure of the
fundamental limits to practical attacks.



3.1 Optimal Brute-Force Attacks 

Let $\mathcal{X}$ be a finite set and suppose that X is the $\mathcal{X}$-valued random variable
determined by probability distribution p. We may arrange $\mathcal{X}$ so that the
probabilities $p_i = p(x_i)$ satisfy

$$p_1 \ge p_2 \ge \cdots \ge p_{|\mathcal{X}|}. \qquad (3)$$

^ We resist calling guesswork “guessing entropy” , as is done in because Theorem 
3 below is so closely analogous to Shannon’s First Theorem ^ that the natural 
analogue of guesswork is really the expected codeword length of Shannon’s theorem, 
as discussed in Remark | below. It perhaps makes more sense to call variation 
distance a kind of (relative) entropy, because it appears in the upper and lower 
bounds of Theorem J just as entropy does in Shannon’s theorem. Thus to call 
guesswork “guessing entropy” might lead to confusion.

Many situations in cryptology and computer security force an adversary to
conduct a brute-force attack in which the values of $\mathcal{X}$ are enumerated and tested
for a certain success condition. The only possible luxury afforded the adversary
is that he or she may know which events are more likely. For example, UNIX
passwords are routinely guessed with the aid of a public-domain software
package called crack which can be configured to test the most likely passwords
first. The safest bet for the cryptographer is to assume that the adversary has
complete knowledge of p and will conduct any brute-force attack optimally, i.e.
in the order given by (3). This suggests the following definitions.

Definition 2. Let X be an $\mathcal{X}$-valued random variable whose probabilities are
arranged according to (3). The guesswork of X is given by

$$W(X) = \sum_{i=1}^{|\mathcal{X}|} i\, p_i.$$

The following simple algorithm demonstrates the computational meaning behind 
guesswork. The adversary is assumed to have access to the necessary optimal 
enumerator and an oracle which tells whether they have guessed correctly. 

Algorithm 1. Optimal brute-force attack against X which will always succeed
and has expected time complexity O(W(X)).

input: (i) An enumerator of the values of $\mathcal{X}$ in order of
nonincreasing probability. (ii) An oracle which answers
whether X = x.
output: The value of X.

for x in $\mathcal{X}$ do
    if X = x then
        return x.
    endif
done

Clearly, the average computation time of Algorithm 1 is W(X). Thus, guesswork
may be interpreted as the optimal expected work involved in guessing the value 
of a random variable. 
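
The definition of guesswork and the optimal brute-force search can be sketched in a few lines of Python. This is an illustration under assumed names (`guesswork`, `brute_force`) and an assumed oracle modelled as a Python callable; it is not taken from the paper.

```python
def guesswork(probs):
    """W(X) = sum_i i * p_i, with the probabilities sorted nonincreasingly."""
    ordered = sorted(probs, reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))

def brute_force(candidates, oracle):
    """Try candidates in the given order; return (value, number of guesses)."""
    for guesses, x in enumerate(candidates, start=1):
        if oracle(x):
            return x, guesses
    raise ValueError("the secret is not among the candidates")

# Hypothetical distribution on four values, listed most likely first.
dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
order = sorted(dist, key=dist.get, reverse=True)

w = guesswork(dist.values())
# Averaging the guess count over the possible secrets gives the same number.
expected_guesses = sum(
    dist[s] * brute_force(order, lambda x, s=s: x == s)[1] for s in dist
)
```

For this distribution both `w` and `expected_guesses` equal 1.875, whereas a uniform distribution on four values gives 2.5.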



3.2 Guesswork and Variation Distance 

It is easily seen that guesswork is bounded above by

$$W(X) \le \frac{|\mathcal{X}| + 1}{2}, \qquad (4)$$

and that equality is achieved if and only if X is uniformly distributed on $\mathcal{X}$.
The next theorem offers tight upper and lower bounds on the
difference between guesswork and its maximum. The situation is analogous to
Shannon's First Theorem in which the average codeword length (the thing you
want to know) is bounded above and below by expressions involving entropy
(the thing you can often compute).

Theorem 1. Let $\mathcal{X}$ be a set of size n, and let X be an $\mathcal{X}$-valued random
variable defined by probability distribution p. Then,

$$\frac{n}{2} \|p - u\| \le \frac{n+1}{2} - W(X) \le n \|p - u\|. \qquad (5)$$

The theorem is proved elsewhere. Note that when $\|p - u\|$ is sufficiently small, the
upper and lower bounds of

$$\frac{n+1}{2} - n \|p - u\| \le W(X) \le \frac{n+1}{2} - \frac{n}{2} \|p - u\|, \qquad (6)$$

are both positive. We see that as $\|p - u\| \to 0$, W(X) approaches its maximum
within increasingly tight bounds.
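
The bounds of Theorem 1 can be spot-checked numerically. The sketch below draws random distributions (with a fixed seed, an arbitrary choice) and tests inequality (5) up to a small floating-point tolerance; the helper names are assumptions of this example, and a passing check is of course not a proof.

```python
import random

def guesswork(probs):
    """W(X) = sum_i i * p_i, with the probabilities sorted nonincreasingly."""
    ordered = sorted(probs, reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))

def distance_to_uniform(probs):
    """Variation distance ||p - u|| computed as half of the l1 distance."""
    n = len(probs)
    return 0.5 * sum(abs(p - 1.0 / n) for p in probs)

def theorem1_holds(probs, tol=1e-9):
    """Check (n/2)||p-u|| <= (n+1)/2 - W(X) <= n||p-u|| for one distribution."""
    n = len(probs)
    gap = (n + 1) / 2 - guesswork(probs)
    d = distance_to_uniform(probs)
    return n / 2 * d - tol <= gap <= n * d + tol

rng = random.Random(0)
samples = []
for _ in range(1000):
    n = rng.randint(1, 20)
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    samples.append([w / total for w in weights])

all_ok = all(theorem1_holds(p) for p in samples)
```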

Remark 1. For small values of the variation distance, (6) admits the
approximation

$$W(X) \approx \frac{n}{2} \left(1 - \|p - u\|\right).$$

In this form, an analogy to Shannon's First Theorem is rather apt, because
the optimal expected codeword length $L^*$ may similarly be approximated by

$$L^* \approx \log(n) - D(p\|u),$$

where $D(p\|u) = \log(n) - H(p)$ is the Kullback-Leibler distance to uniformity,
and H(p) is the entropy of p. Notice that $\|p - u\|$ has a supremum of 1, while
$D(p\|u)$ has a maximum of log(n).

Furthermore in Shannon's theorem, the optimal codeword length is within 1
bit of the entropy. However entropy is a logarithmic quantity relative to
guesswork. In that sense, (5) says that the cost of being naive in a non-optimal search
is within 1 bit of the quantity $\log(n \|p - u\|)$.

4 Security Measures for Known and Chosen Plaintext 
Attacks 

In this section we consider a cipher’s capacity for resisting known and chosen 
plaintext attacks. 

4.1 Conditional Guesswork and the Security Factors 

In a known or chosen plaintext attack, a single encryption key is used to encrypt 
a number of different plaintexts. An adversary who observes the corresponding 
plaintext-ciphertext pairs is privy to partial information about the key. In this 
section we quantify the intuitive notion that the resilience of a cipher against
known or chosen plaintext attacks can be measured by the amount of work
required to guess the key after information about the plaintext-ciphertext pairs
has been taken into account.

Formally if G is a subgroup of $\mathfrak{S}_{\mathcal{M}}$, let X be a G-cipher representing a single
choice of a cipher's key. Again let $x(g) = P[X = g]$, and let $P^\ell = (P_1, P_2, \ldots, P_\ell)$
be an $\ell$-tuple of i.i.d. random variables describing a sequence of distinct plaintexts
in $\mathcal{M}$. Let $\mathcal{M}^{(\ell)}$ denote the set of $\ell$-tuples with distinct elements of
$\mathcal{M}$. $G \le \mathfrak{S}_{\mathcal{M}}$ and hence G acts on $\mathcal{M}^{(\ell)}$ in the natural way, namely
$\sigma(m_1, \ldots, m_\ell) = (\sigma m_1, \ldots, \sigma m_\ell)$. Now define

$$C^\ell = (X P_1, X P_2, \ldots, X P_\ell) = (C_1, C_2, \ldots, C_\ell).$$

In other words, $P^\ell$ and $C^\ell$ are $\mathcal{M}^{(\ell)}$-valued random variables. We write
$p, c \in \mathcal{M}^{(\ell)}$ for instances of $P^\ell$ and $C^\ell$. When contemplating the loss of security due
to observations of plaintext-ciphertext pairs, it is natural to define, by analogy
to conditional entropy, notions of conditional guesswork.

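Definition 3. Given the quantities described above, the conditional guesswork
of X given $C^\ell$ and $P^\ell$ is defined as

$$W(X \mid C^\ell, P^\ell) = \sum_{c, p \in \mathcal{M}^{(\ell)}} W(X \mid C^\ell = c, P^\ell = p)\, p_J(c, p).$$

The conditional guesswork of X given $C^\ell$ and that $P^\ell = p$ is defined as

$$W(X \mid C^\ell, P^\ell = p) = \sum_{c \in \mathcal{M}^{(\ell)}} W(X \mid C^\ell = c, P^\ell = p)\, p_C(c \mid p).$$

Here $p_J(c, p)$ is the joint distribution of $C^\ell$ and $P^\ell$, while $p_C(c \mid p)$ is the conditional
distribution of $C^\ell$ given $P^\ell$. These are respectively given by

$$p_J(c, p) = P[C^\ell = c, P^\ell = p], \quad \text{and} \quad p_C(c \mid p) = P[C^\ell = c \mid P^\ell = p].$$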

The two kinds of conditional guesswork will be used to quantify the perfor- 
mances of optimal known and chosen plaintext attacks, justifying the following 
definitions. 

Definition 4. The known plaintext security factor of X against the
observation of $\ell$ plaintext-ciphertext pairs is defined as

$$\nu_\ell(X) = W(X \mid C^\ell, P^\ell),$$

and the chosen plaintext security factor of X against a choice of $\ell$
plaintext-ciphertext pairs is defined as

$$\theta_\ell(X) = \min_{p \in \mathcal{M}^{(\ell)}} W(X \mid C^\ell, P^\ell = p).$$

Finally, we define $\nu_0(X) = \theta_0(X) = W(X)$.

Notice that the chosen plaintext security factor is independent of the plaintext
statistics, as one would expect. The next proposition establishes $\theta_\ell(X)$ as a
principal figure of merit.

Proposition 1. $\theta_\ell(X) \le \nu_\ell(X)$.

Proof. Simply expand out the formulas,

$$\nu_\ell(X) = \sum_{p \in \mathcal{M}^{(\ell)}} W(X \mid C^\ell, P^\ell = p)\, p_P(p) \ge \sum_{p \in \mathcal{M}^{(\ell)}} \theta_\ell(X)\, p_P(p) = \theta_\ell(X),$$

where $p_P(p) = P[P^\ell = p]$. □



4.2 Observing Plaintext-Ciphertext Pairs 



An elementary observation about group action leads to a fundamental fact. Put
simply, the guesswork $W(X \mid C^\ell = c, P^\ell = p)$ is entirely determined by the
distribution of X on a coset of a certain subgroup of G.

Lemma 1 (Coset Work Lemma). With X, $C^\ell$, $P^\ell$ and $c, p \in \mathcal{M}^{(\ell)}$ defined
as before, there is a $g_{cp} \in G$ such that $c = g_{cp}\, p$. The conditional guesswork

$$W(X \mid C^\ell = c, P^\ell = p),$$

is determined by the distribution of X on the left coset $g_{cp} H$, where H is the
stabilizer subgroup $\mathrm{Stab}_G(p)$.

Proof. The proof uses familiar group action observations. Let
$g_{cp}$ be the value of X. By definition $c = g_{cp}\, p$, and it is standard that

$$\{g \in G \mid c = g p\} = g_{cp}\, \mathrm{Stab}_G(p) = g_{cp} H.$$

If $X'$ is the random variable $(X \mid C^\ell = c, P^\ell = p)$, we have

$$P[X' = g] = \begin{cases} \dfrac{x(g)}{x(g_{cp} H)}, & \text{if } g \in g_{cp} H, \\ 0, & \text{otherwise,} \end{cases}$$

by Bayes's theorem. Now $W(X') = W(X \mid C^\ell = c, P^\ell = p)$, which completes the
proof. □
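
The coset work lemma can be made concrete on a small group. In the sketch below, a cipher over the symmetric group on three points is conditioned on one plaintext-ciphertext pair by renormalising its distribution on the matching coset. The tuple representation of permutations, the weights and the function names are assumptions made for this example.

```python
from itertools import permutations

def guesswork(probs):
    """W(X) = sum_i i * p_i, with the probabilities sorted nonincreasingly."""
    ordered = sorted(probs, reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))

def conditional_guesswork(x, p, c):
    """W(X | C = c, P = p): renormalise x on {g : g maps tuple p to tuple c}."""
    coset = [g for g in x if tuple(g[m] for m in p) == tuple(c)]
    mass = sum(x[g] for g in coset)
    if mass == 0:
        raise ValueError("the pair (p, c) has probability zero")
    return guesswork([x[g] / mass for g in coset])

# Hypothetical non-uniform cipher over S_3 (weights 6, 5, ..., 1 out of 21).
S3 = list(permutations(range(3)))
x = {g: w / 21 for g, w in zip(S3, [6, 5, 4, 3, 2, 1])}

# One observed pair: plaintext 0 was encrypted to ciphertext 1.
w_cond = conditional_guesswork(x, p=(0,), c=(1,))
```

Only the two permutations sending 0 to 1 remain, with conditional probabilities 4/7 and 3/7, so `w_cond` is 10/7. The values of x outside that coset play no role.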



The coset work lemma suggests optimal algorithms for attacking a cipher X.
Given $\ell$ plaintext-ciphertext pairs $c, p \in \mathcal{M}^{(\ell)}$, we may restrict the optimal
search for the value of X in Algorithm 1 to a coset of $H = \mathrm{Stab}_G(p)$. Thus we
have the following algorithms.

Algorithm 2 (Optimal Known Plaintext Attack). Defines algorithm $kpa_\ell$.

1. Collect into tuple p, $\ell$ random plaintexts according to their
natural statistics.

2. Collect into tuple c, the corresponding ciphertexts.

3. Invoke Algorithm 1 to optimally search $g_{cp}\, \mathrm{Stab}_G(p)$ for the
value of X, where $c = g_{cp}\, p$.

Algorithm 3 (Optimal Chosen Plaintext Attack). Defines algorithm $cpa_\ell$.

1. Let p minimize $W(X \mid C^\ell, P^\ell = p)$.

2. Let c = Xp.

3. Invoke Algorithm 1 to optimally search $g\, \mathrm{Stab}_G(p)$ for the value
of X, where c = gp.

The next proposition justifies the definitions of security factors $\nu_\ell(X)$ and $\theta_\ell(X)$.
A formal proof of this intuitive statement is given elsewhere.

Proposition 2. Under the assumption that the various oracles respond
instantaneously, the expected computation time of attacks $kpa_\ell$ and $cpa_\ell$ against X
are given by $\nu_\ell(X)$ and $\theta_\ell(X)$, respectively.

4.3 Uniformly Distributed and (Conditionally) Perfect Ciphers 

Ciphers for which every achievable message permutation is equally likely have
extraordinary properties, making them worthy of special attention. There is one
such cipher for every subgroup G of the symmetric group $\mathfrak{S}_{\mathcal{M}}$. Their security
factors greatly simplify and can often be explicitly computed. When $G = \mathfrak{S}_{\mathcal{M}}$, we
shall show that the resulting uniformly distributed cipher is perfect in meaningful
ways.

Definition 5. For any $G \le \mathfrak{S}_{\mathcal{M}}$, the uniformly distributed G-cipher denoted
$U_G$ is called the uniform G-cipher. In case $G = \mathfrak{S}_{\mathcal{M}}$, $U_G$ will simply be
denoted U and called the perfect cipher.

The coset work lemma admits an immediate simplification for uniform ciphers.

Theorem 2. For every $p \in \mathcal{M}^{(\ell)}$,

$$W(U_G \mid C^\ell, P^\ell = p) = \frac{1}{2} \left(1 + |\mathrm{Stab}_G(p)|\right).$$

Proof. For specific $c, p \in \mathcal{M}^{(\ell)}$, the coset work lemma tells us that
$W(U_G \mid C^\ell = c, P^\ell = p) = W(U'_G)$,
where $U'_G$ is uniformly distributed on a coset of $H = \mathrm{Stab}_G(p)$. By (4),

$$W(U'_G) = \frac{1}{2} \left(1 + |g_{cp} H|\right) = \frac{1}{2} \left(1 + |H|\right),$$

which is independent of c. The desired result follows. □

For a uniform cipher, we immediately have that the chosen plaintext security
factor is a function only of the size of the smallest $\ell$-message stabilizer.

Corollary 1. For any $G \le \mathfrak{S}_{\mathcal{M}}$,

$$\theta_\ell(U_G) = \frac{1}{2} \left(1 + \min_{p \in \mathcal{M}^{(\ell)}} |\mathrm{Stab}_G(p)|\right).$$

For the perfect cipher U, which is uniformly distributed over the entire symmetric
group, we can obtain precise formulas for $\nu_\ell(U)$ and $\theta_\ell(U)$.



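Proposition 3.

$$\nu_\ell(U) = \theta_\ell(U) = \frac{1}{2} \left(1 + (|\mathcal{M}| - \ell)!\right).$$

Proof. For any tuple $p \in \mathcal{M}^{(\ell)}$, the stabilizer subgroup of p in $\mathfrak{S}_{\mathcal{M}}$ is the
symmetric group on the remaining messages $\mathcal{M} - \{p\}$. Each stabilizer therefore has
$(|\mathcal{M}| - \ell)!$ elements, and we may apply Corollary 1 to obtain $\theta_\ell(U)$. Furthermore,
we have

$$\nu_\ell(U) = \sum_{p \in \mathcal{M}^{(\ell)}} W(U \mid C^\ell, P^\ell = p)\, p_P(p) = \theta_\ell(U) \sum_{p \in \mathcal{M}^{(\ell)}} p_P(p) = \theta_\ell(U).$$

□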



What Proposition 3 tells us is that we can determine exactly the expected
performance of the optimal known and chosen plaintext attacks $kpa_\ell$ and $cpa_\ell$
against a perfect cipher. Provided $\ell \ll |\mathcal{M}|$, these attacks reduce to very long
brute-force searches. The addition of a new plaintext-ciphertext pair reduces the
size, in bits, of the effective search space by

$$\log \theta_\ell(U) - \log \theta_{\ell+1}(U) \approx \log \frac{(|\mathcal{M}| - \ell)!}{(|\mathcal{M}| - \ell - 1)!} \le \log |\mathcal{M}|.$$

Thus, for a cipher of block length n, $|\mathcal{M}| = 2^n$ and each new plaintext-ciphertext
pair reduces the search space by no more than n bits. But by Stirling's formula

$$\log \theta_0(U) \approx \log |\mathfrak{S}_{\mathcal{M}}| \approx n 2^n \text{ bits}.$$

In other words, in order to reduce the search space to within a reasonably
attackable size, on the order of $2^n$ distinct plaintext-ciphertext pairs must be obtained.
By that time the adversary has a table of all $2^n$ possible ciphertexts from which
they can look up any desired target plaintext. One cannot expect a block
cipher to perform better than this. We now explicitly prove what this discussion
suggests, namely that the perfect cipher is as secure as any cipher.
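
The bit counts in the discussion above can be reproduced numerically. The sketch below uses `math.lgamma` to evaluate the logarithm of a factorial without building the huge integer; the function names are assumptions of this example, and the "+1" inside the security factor is ignored, so the figures are approximations.

```python
import math

def log2_factorial(m):
    """log2(m!) computed through the log-gamma function."""
    return math.lgamma(m + 1) / math.log(2)

def key_space_bits(n):
    """Approximate log2 of the number of permutations of 2**n messages."""
    return log2_factorial(2 ** n)

def bits_removed_by_next_pair(n, ell):
    """log2((2**n - ell)!) - log2((2**n - ell - 1)!) = log2(2**n - ell)."""
    return math.log2(2 ** n - ell)

n = 8
total_bits = key_space_bits(n)              # about 1684 bits for n = 8
leading_term = n * 2 ** n                   # 2048, the leading Stirling term
first_pair = bits_removed_by_next_pair(n, 0)  # exactly n bits for ell = 0
```

For n = 8 the search space is roughly 1684 bits, while no single pair removes more than 8 bits, so on the order of $2^n$ pairs are needed before the search becomes small.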

Theorem 3. For any G-cipher X with $G \le \mathfrak{S}_{\mathcal{M}}$,

$$\nu_\ell(X) \le \nu_\ell(U), \quad \text{and} \quad \theta_\ell(X) \le \theta_\ell(U).$$

Proof. Since $G \le \mathfrak{S}_{\mathcal{M}}$, $\mathrm{Stab}_G(p) \le \mathrm{Stab}_{\mathfrak{S}_{\mathcal{M}}}(p)$, so that any distribution on a
coset of $\mathrm{Stab}_G(p)$ can be thought of as a distribution on a coset of $H = \mathrm{Stab}_{\mathfrak{S}_{\mathcal{M}}}(p)$.
Once again invoking the coset work lemma along with the fact that the uniform
distribution is majorized by any other distribution, we obtain

$$W(X \mid C^\ell = c, P^\ell = p) \le W(U \mid C^\ell = c, P^\ell = p) = \frac{1}{2} \left(1 + |H|\right).$$

The theorem is essentially proved. Pedantically, one should expand out the
formulas for $\nu_\ell(X)$ and $\theta_\ell(X)$. □



4.4 The Security Factors and Variation Distance 

Theorem 1 gave us, in terms of variation distance, tight bounds on the difference
between guesswork and its maximum value. If we wish variation distance to have
a deeper security meaning, it is natural to seek similar bounds on conditional
guesswork.

Theorem 4. For a permutation group $G \le \mathfrak{S}_{\mathcal{M}}$, let X be a G-cipher with
probability distribution x. If $p \in \mathcal{M}^{(\ell)}$ and $H = \mathrm{Stab}_G(p)$, then

$$W(X \mid C^\ell, P^\ell = p) \ge \frac{1 + |H|}{2} - |G|\, \|x - u\|.$$

The proof of Theorem 4, which is presented in detail elsewhere, is essentially based
on the decomposition of the group algebra $\mathbb{R}G$ into a direct sum of vector spaces
isomorphic to the smaller group algebra $\mathbb{R}H$. This is a special case of a very
important construction called the induced representation.

We may bound $\theta_\ell(X)$ by an expression which is a function of the variation
distance $\|x - u\|$, and as in Corollary 1, the size of the smallest $\ell$-message
stabilizer.

Corollary 2. For any $G \le \mathfrak{S}_{\mathcal{M}}$ and any G-cipher X with probability
distribution x,

$$\theta_\ell(X) \ge \frac{1}{2} \left(1 + \min_{p \in \mathcal{M}^{(\ell)}} |\mathrm{Stab}_G(p)|\right) - |G|\, \|x - u\|.$$

Proof. By definition, $\theta_\ell(X) = W(X \mid C^\ell, P^\ell = p)$, where p minimizes the
conditional guesswork. Writing $H = \mathrm{Stab}_G(p)$, we observe

$$\theta_\ell(X) = W(X \mid C^\ell, P^\ell = p)$$

$$\ge \frac{1 + |H|}{2} - |G|\, \|x - u\|$$

$$\ge \frac{1}{2} \left(1 + \min_{p \in \mathcal{M}^{(\ell)}} |\mathrm{Stab}_G(p)|\right) - |G|\, \|x - u\|,$$

which was to be proved. □

Just as the lower bound on W(X) of (6) is vacuously negative unless $\|x - u\|$
is smaller than 1/2, so too the lower bound given in Corollary 2 will be negative
unless $\|x - u\|$ is sufficiently small. To see when this happens, let H be one of
the smallest $\ell$-message stabilizers, and rearrange the inequality of the corollary
as

$$\theta_\ell(X) \ge \frac{|H|}{2} \left(1 - 2\,[G:H]\, \|x - u\|\right). \qquad (7)$$

When represented in this way, the lower bound on $\theta_\ell(X)$ becomes meaningful
only when $\|x - u\|$ is less than $1/(2\,[G:H])$. It is not terribly surprising that
resistance to chosen plaintext attacks should come at some measurable cost.

5 Applications to Iterated Cryptosystems 

5.1 Generalized Markov Ciphers and the Cut-Off Phenomenon 

Under the assumption of subkey independence, an iterated cryptosystem is
equivalent to the product of finitely many independent and identically
distributed G-ciphers. The sequence of all such products, as the number of rounds
r ranges from 0 to $\infty$, defines a random walk on G whose underlying Markov
chain has many important security properties.

Formally, let $(X_i)_{i=1}^{\infty}$ be an infinite sequence of i.i.d. G-ciphers, each with
probability distribution $x(g) = P[X_i = g]$. Define the sequence $(Z_r)_{r=0}^{\infty}$ of
G-ciphers by $Z_0 = 1$ and $Z_r = X_r \cdots X_2 X_1$. Applying (1), we see that the
distribution of $Z_r$ is given by an r-fold convolution of x with itself

$$P[Z_r = g] = x^{*r}(g).$$

The next result follows from Proposition 5 below.

Proposition 4. The sequence $(Z_i)$ is a Markov chain with state space G.

This fact allows us to generalize the definition of a Markov cipher given by Lai,
Massey and Murphy. Our motivation is itself a generalization of theirs. Their
idea of a Markov cipher was used to model resistance to Differential
Cryptanalysis as a function of the number of rounds. Similarly we seek to quantify the
resistance of an iterated cryptosystem to known and chosen plaintext attacks,
as a function of the number of rounds.

Definition 6. With x and $(X_i)$ defined as above, the G-cipher $Z_r =
X_r \cdots X_2 X_1$ is called the (generalized) Markov cipher generated by r rounds
of x, and $(Z_i)$ will be called the Markov chain generated by x.

There are a multitude of different Markov chains resulting from the action of
$(Z_i)$ on various G-sets. The following proposition is proved elsewhere.

Proposition 5. Let x be a probability distribution on a finite group G, and
let $(Z_i)$ be the Markov chain generated by x. If $\mathcal{X}$ is a G-set, and $Y_0$ is an
independent $\mathcal{X}$-valued random variable, then the sequence $(Y_i = Z_i Y_0)_{i \ge 0}$ is a
Markov chain on the state space $\mathcal{X}$. The transition matrix is doubly stochastic
and is completely determined by x.

Recent decades have witnessed a renaissance in Markov chain research
spearheaded by Aldous and Diaconis. An important and frequent observation
of this research has been that many random walks on groups and other discrete
structures exhibit cut-off phenomena, in which there is a rapid transition from
order to uniform randomness. The phenomenon is often quantified in terms of the
variation distance $\|x^{*r} - u\|$, and in fact $1 - \|x^{*r} - u\|$ often follows a profile like
the one in Fig. 1. Proofs of cut-off phenomena for special cases abound in the
literature. They sometimes employ representation theoretic arguments,
and they sometimes employ more probabilistic arguments.

In the next section, we explore how probabilistic arguments can establish the
non-asymptotic behavior of an iterated cryptosystem. Some cryptological
implications of the representation theoretic approach are explored elsewhere.
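
The doubly stochastic property stated in Proposition 5 is easy to check on a toy example. The sketch below builds the transition matrix of the chain induced on the points {0, 1, 2} by an arbitrary distribution on three permutations; the distribution, the tuple representation and the function name are assumptions made for illustration.

```python
def transition_matrix(x, points):
    """T[a][b] = probability of moving from point a to point b in one round."""
    index = {a: i for i, a in enumerate(points)}
    T = [[0.0] * len(points) for _ in points]
    for g, prob in x.items():
        for a in points:
            T[index[a]][index[g[a]]] += prob   # g sends point a to g[a]
    return T

# Hypothetical round distribution on S_3 (a tuple g maps i to g[i]).
x = {(0, 1, 2): 0.5, (1, 0, 2): 0.3, (1, 2, 0): 0.2}
points = [0, 1, 2]
T = transition_matrix(x, points)

row_sums = [sum(row) for row in T]
col_sums = [sum(T[a][b] for a in range(len(points))) for b in range(len(points))]
```

Rows sum to 1 because x is a probability distribution, and columns sum to 1 because every group element permutes the points.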



5.2 Strong Uniform Times 



Following P and ^ Chap. 4], we introduce some basic definitions useful in 
making probabilistic arguments for establishing the behavior of $\|x^{*r} - u\|$.

Let $\mathbb{N} = \{1, 2, \ldots\}$, and take $G^{\infty}$ to be the set of infinite sequences of elements
of G. A stopping rule is a function

$$t : G^{\infty} \to \mathbb{N} \cup \{\infty\},$$

such that if $t(g_1, g_2, \ldots) = i$, then $t(g'_1, g'_2, \ldots) = i$ whenever $g'_j = g_j$, $j \le i$. If $Z_i$
is a sequence of G-valued random variables, then the $\mathbb{N}$-valued random variable
$T = t(Z_1, Z_2, \ldots)$ is called a stopping time. In essence, the stopping rule identifies
the first time that a certain condition is met in a sequence of group elements,
and the stopping time describes random fluctuations in the first occurrence of
that condition.

Of course, we are interested in the evolution of Markov ciphers and their
approach to uniformity. Let $Z_r$ be the Markov cipher generated by r rounds of
x. If the condition being met by a stopping time T is sufficient to guarantee
uniformity of $Z_r$, in other words if

$$P[Z_r = g \mid T \le r] = \frac{1}{|G|}, \quad \text{for all } g \in G,$$

then T is called a strong uniform time (for x). As one might intuitively
expect, the statistics of a strong uniform time can characterize the approach to
uniformity of the Markov cipher.



Lemma 2 (Aldous, Diaconis). Let x be a probability distribution on a
finite group G, and let T be a strong uniform time for x. Then

$$\|x^{*r} - u\| \le P[T > r], \quad \text{for all } r \ge 0.$$

5.3 An Example: Top-to-Random Shuffle

Consider shuffling a deck of k cards by repeatedly removing the top card and
returning it to a random position in the deck. Each step can be modeled by
choosing a random permutation in $\mathfrak{S}_k$ of the form $\gamma_i = (i \ldots 2\, 1)$ with probability
$x(\gamma_i) = 1/k$, $1 \le i \le k$. There is a strong uniform time for x defined in the
following way. Let $t_1$ be a stopping rule expressing the first time that $\gamma_k$ is
chosen. At this point the specific card j has been moved from the top to the
bottom of the deck. Let $t_{k-1}$ be the first time after $t_1$ that j returns to the top
of the deck. At this point each permutation of the remaining cards is equally
likely. At $t_k = t_{k-1} + 1$, in other words after the j on top of the deck is placed at
random within the deck, each permutation of the deck is equally likely. If $(Z_i)$
is the Markov chain generated by x, then the random time $T = t_k(Z_1, Z_2, \ldots)$ is
a strong uniform time for x.

Aldous and Diaconis show that the probability P[T > r] is governed
by the “coupon-collector's problem” and is bounded by

$$P[T > k \log k + ck] \le e^{-c}, \quad c \ge 0,\ k > 0.$$

Thus we have a cut-off point $r_0 = k \log k$, and

$$\|x^{*r} - u\| \le e^{-(r - r_0)/k}, \quad r \ge r_0.$$
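
For a very small deck the variation distance can be computed exactly by repeated convolution and compared with the bound above. The sketch below does this for k = 4, using the natural logarithm for the cut-off point; the position-map representation of the shuffle permutations and the function names are assumptions of this example.

```python
import math
from itertools import permutations

def insert_top_at(i, k):
    """Position map moving the top card (position 0) to position i-1, 1 <= i <= k."""
    return tuple((j - 1) % i if j < i else j for j in range(k))

def compose(g, h):
    """Return the permutation "apply h first, then g"."""
    return tuple(g[h[j]] for j in range(len(h)))

def shuffle_distances(k, rounds):
    """Exact variation distance to uniform after r = 0, 1, ..., rounds steps."""
    steps = [insert_top_at(i, k) for i in range(1, k + 1)]
    group = list(permutations(range(k)))
    u = 1.0 / len(group)
    z = {tuple(range(k)): 1.0}           # Z_0 is the identity with probability 1
    distances = []
    for _ in range(rounds + 1):
        distances.append(0.5 * sum(abs(z.get(g, 0.0) - u) for g in group))
        nxt = {}
        for h, prob in z.items():        # Z_{r+1} = X_{r+1} Z_r
            for s in steps:
                g = compose(s, h)
                nxt[g] = nxt.get(g, 0.0) + prob / k
        z = nxt
    return distances

k = 4
distances = shuffle_distances(k, 30)
r0 = k * math.log(k)                     # cut-off point k log k, about 5.55 here
bounds = {r: math.exp(-(r - r0) / k) for r in range(6, 31)}
```

With these definitions `distances[r]` stays below `bounds[r]` for every r from 6 to 30, and the distance starts at 23/24 when r = 0.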

The simplicity of the top-to-random shuffle allows it to be implemented as an iterated cryptosystem. Consider the shuffle permutations acting on the $k = 2^n$ bit strings of length n. By considering these bit strings as binary representations of the integers $\{0, \ldots, 2^n - 1\}$, the following algorithm implements one round of the shuffle.

Algorithm 4. Defines function TopToRand(n, i), which implements one round of the top-to-random shuffle. We assume the existence of a pseudo-random number generator (PRNG) satisfying 1 ≤ random(n) ≤ n, and uniformly distributed thereon.

input: The block length n, and the plaintext input, represented as an integer 0 ≤ i < 2^n.

output: The ciphertext output.

function TopToRand(n, i):
    m ← random(2^n).
    if i < m then
        return i − 1 mod m.
    else
        return i.
    endif
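The round function of Algorithm 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `top_to_rand_round` and `top_to_rand` are ours, and Python's `random` module stands in for the PRNG `random(2^n)`; it is not cryptographically strong.

```python
import random


def top_to_rand_round(i, m):
    """Apply one shuffle round with round value m to position i.

    Positions 0..m-1 are rotated down by one (position 0 wraps to m-1);
    positions >= m stay fixed, matching the branch in Algorithm 4.
    """
    if i < m:
        return (i - 1) % m
    return i


def top_to_rand(n, i, rng=random):
    """One round on an n-bit block i; m is drawn uniformly from 1..2**n."""
    m = rng.randint(1, 2 ** n)  # stand-in for the PRNG call random(2^n)
    return top_to_rand_round(i, m)
```

For any fixed m the map is a permutation of the 2^n positions, which is what makes each round invertible once m is known.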

Unfortunately, the previous algorithm does not achieve security within a practical number of rounds because, for a reasonable block length, the cut-off point $r_0 = n 2^n$ is too large. Nevertheless, the example shows that there exists a cipher with an efficient round function, and for which an explicit cut-off point can be computed. Furthermore, using the inequalities established earlier, we can also compute explicit bounds on the guesswork and chosen plaintext security factor. For $r \ge r_0$,



on I 

W{Zr) > — 



1 - , 



and 



0e{Zr) > 



( 2 " -£)! 
2 



1 



2n^-|-lg-(r-ro)/2" 



The lower bound on $\theta_\ell(Z_r)$ makes use of the fact that all $\ell$-message stabilizers of $\mathfrak{S}_{2^n}$ have size $(2^n - \ell)!$. As soon as the quantities in brackets are positive, the lower bounds on $W(Z_r)$ and $\theta_\ell(Z_r)$ grow quickly toward intractably large quantities, forcing even the most endowed adversary to work “forever” guessing the key.



6 Conclusion 

We have successfully demonstrated that inequalities involving guesswork, con- 
ditional guesswork and variation distance can be used to establish the number 
of rounds necessary to achieve provable security in an iterated cryptosystem. 
Though in the example given here, that number of rounds grows exponentially 
with the block length, the iterations could still be applied to a smaller message 
space to produce provably secure S-boxes. Ongoing research suggests that iterated cryptosystems exist in which the round function is computationally efficient and the number of rounds required for provable security is a polynomial in the block length. Some caveats to this approach include:

— We assume the existence of a cryptographically strong pseudo-random func- 
tion. To date such functions are based on hard open problems and bounded 
computational resources. 

— Variation distance is relatively sensitive to small deviations away from unifor- 
mity. It may therefore prove to be overly conservative as a security measure. 

— Direct application of these techniques to existing block ciphers such as DES is not expected to be fruitful because it is known that keys of nonzero probability are sparse in a large group. Furthermore, it took several decades of open research to establish the precise group for DES. Nevertheless, in the design of new ciphers, the group G is easily treated as a design parameter. Large candidate groups which are smaller than $\mathfrak{S}_{2^n}$ include various wreath products.
Abstract. In this paper we present a model for the bias values asso- 
ciated with linear characteristics of substitution-permutation networks 
(SPN’s). The first iteration of the model is based on our observation 
that for sufficiently large s-boxes, the best linear characteristic usually 
involves one active s-box per round. We obtain a result which allows us 
to compute an upper bound on the probability that linear cryptanalysis 
using such a characteristic is feasible, as a function of the number of 
rounds. We then generalize this result, upper bounding the probability 
that linear cryptanalysis is feasible when any linear characteristic may 
be used (no restriction on the number of active s-boxes). The work of 
this paper indicates that the basic SPN structure provides good secu- 
rity against linear cryptanalysis based on linear characteristics after a 
reasonably small number of rounds. 



1 Introduction 

A substitution-permutation network (SPN) is a basic cryptosystem architecture which implements Shannon’s principles of “confusion” and “diffusion”, and which was first proposed by Feistel. An SPN is in some sense the simplest implementation of Shannon’s principles. Its basic structural elements of substitution and linear transformation are the foundation of many modern block ciphers, as can be seen from the current AES candidates (for example, Serpent uses a straight SPN structure). Viewing the basic SPN architecture as a “canonical” cryptosystem has provided a useful model for study, yielding a range of analytical and experimental results.

In this paper we consider the linear cryptanalysis of SPN’s, developing a 
model which allows us to bound the probability that a linear attack based on 
linear characteristics will succeed. The result is of interest because, in prac- 
tice, linear cryptanalysis often relies on carefully chosen linear characteristics. 
It should be noted, however, that to achieve “provable security” against linear cryptanalysis, resistance to linear hulls, the counterpart of differentials in differential cryptanalysis, must be demonstrated (see Nyberg).

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 78-^^2000. 
2 Substitution-Permutation Networks 

A substitution-permutation network processes an N-bit plaintext through a series of R rounds, each round consisting of a substitution stage followed by a permutation stage. In the substitution stage, the current block is viewed as M n-bit subblocks, each of which is fed into a bijective n × n substitution box (s-box), i.e., a bijective function mapping $\{0,1\}^n \to \{0,1\}^n$. This is followed by a permutation stage, originally a bit-wise permutation, but more generally an invertible linear transformation. The permutation stage is usually omitted from the last round. An example of an SPN with N = 16, M = n = 4, and R = 3 is shown in Figure 1. Incorporation of key bits typically involves the derivation of (R + 1) N-bit subkeys, denoted $K^1, K^2, \ldots, K^{R+1}$, from the original key, K, via a key scheduling algorithm. Subkey $K^r$ is XOR’d with the current block before round r, and subkey $K^{R+1}$ is XOR’d with the output of the last round to form the ciphertext. For the purpose of what follows, we will assume that K is an independent key, that is, a concatenation of (R + 1) N-bit subkeys which are not necessarily derivable from some master key via a key-scheduling algorithm (therefore $K \in \{0,1\}^{N(R+1)}$).

Fig. 1. Example SPN with N = 16, M = n = 4, R = 3 (diagram: plaintext, rounds 1 to 3, ciphertext)

Decryption is accomplished by running the SPN “backwards,” reversing the order of the rounds, and in each round performing the inverse linear transformation followed by application of the inverse s-boxes (subkey $K^{R+1}$ is first XOR’d with the ciphertext, and each subkey $K^r$ is XOR’d with the current block after decryption round r).

For the purpose of this paper, we adopt an SPN structure with M = n s-boxes of size n × n (n ≥ 2) in each round (therefore $N = n^2$), and an inter-round permutation $\pi : \{0,1\}^N \to \{0,1\}^N$ which connects output bit j of s-box i in round r to input bit i of s-box j in round r + 1, as in Figure 1. (We use the convention that all numbering proceeds from left to right, beginning at 1.)
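The structure just described (n s-boxes of size n × n per round, the transposing inter-round permutation, R + 1 independent subkeys, and no permutation after the last round) can be sketched as below. This is a minimal illustration, not the authors' program: all function names are hypothetical, indices are 0-based rather than 1-based, and the block is held as a list of n subblock values.

```python
import random


def permute(blocks, n):
    """Send output bit j of s-box i to input bit i of s-box j (0-based)."""
    out = [0] * n
    for i in range(n):
        for j in range(n):
            bit = (blocks[i] >> (n - 1 - j)) & 1  # bit j, counted from the left
            out[j] |= bit << (n - 1 - i)          # becomes bit i of subblock j
    return out


def xor(blocks, key):
    return [b ^ k for b, k in zip(blocks, key)]


def random_spn(n, rounds, rng):
    """Random bijective s-boxes and independent subkeys, as in the model."""
    size = 2 ** n
    sboxes = [[rng.sample(range(size), size) for _ in range(n)]
              for _ in range(rounds)]
    subkeys = [[rng.randrange(size) for _ in range(n)]
               for _ in range(rounds + 1)]
    return sboxes, subkeys


def encrypt(blocks, sboxes, subkeys, n):
    rounds = len(sboxes)
    state = list(blocks)
    for r in range(rounds):
        state = xor(state, subkeys[r])
        state = [sboxes[r][i][state[i]] for i in range(n)]
        if r < rounds - 1:                 # permutation omitted in last round
            state = permute(state, n)
    return xor(state, subkeys[rounds])


def decrypt(blocks, sboxes, subkeys, n):
    rounds = len(sboxes)
    state = xor(blocks, subkeys[rounds])
    for r in reversed(range(rounds)):
        if r < rounds - 1:
            state = permute(state, n)      # the transpose is its own inverse
        inverse = [[s.index(v) for v in range(2 ** n)] for s in sboxes[r]]
        state = [inverse[i][state[i]] for i in range(n)]
        state = xor(state, subkeys[r])
    return state
```

Because this particular permutation is a transpose of the n × n bit array, applying it twice restores the block, so decryption can reuse `permute` directly.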

In our model, each s-box in the SPN is chosen uniformly and independently from the set of all bijective n × n s-boxes. In addition, we make the assumption that the input to each encryption round is uniformly and independently distributed over $\{0,1\}^N$. This allows the use of Matsui’s Piling-up Lemma in Sections 4 and 5. This assumption does in fact hold if we observe the inputs to the various rounds while varying over all plaintexts and all keys $K \in \{0,1\}^{N(R+1)}$. However, in practice, K is fixed and only the plaintexts vary. Nyberg provides a rigorous analysis of this issue.



3 Nonlinearity Measures 



The linear approximation table (LAT) of an s-box $S : \{0,1\}^n \to \{0,1\}^n$ is defined as follows: for $\alpha, \beta \in \{0,1\}^n$,

$$\mathrm{LAT}[\alpha, \beta] = \#\{X \in \{0,1\}^n : \alpha \bullet X = \beta \bullet S(X)\} - 2^{n-1},$$

where $\bullet$ is the inner product summed over GF(2). It follows that $\mathrm{LAT}[0,0] = 2^{n-1}$, and if S is bijective, as is the case in an SPN, then $\mathrm{LAT}[\alpha, \beta] = 0$ if exactly one of $\alpha, \beta$ is 0. These entries are of no cryptographic significance; in the discussion below we consider only $\mathrm{LAT}[\alpha, \beta]$ for $\alpha, \beta \ne 0$.

If $\mathrm{LAT}[\alpha, \beta] = 0$, then the function $f_\beta(X) = \beta \bullet S(X)$ is at equal Hamming distance ($2^{n-1}$) from the affine functions $g_\alpha(X) = \alpha \bullet X$ and $g'_\alpha(X) = (\alpha \bullet X) \oplus 1$ (here we view functions mapping $\{0,1\}^n \to \{0,1\}$ as $2^n$-bit vectors for the purpose of measuring Hamming distance). A positive value of $\mathrm{LAT}[\alpha, \beta]$ indicates that $f_\beta$ is closer to $g_\alpha$, and a negative value indicates that $f_\beta$ is closer to $g'_\alpha$. In fact $f_\beta$ can be approximated by $g_\alpha$ with probability

$$p_{\alpha,\beta} = \frac{\#\{X \in \{0,1\}^n : g_\alpha(X) = f_\beta(X)\}}{2^n} = \frac{\mathrm{LAT}[\alpha, \beta] + 2^{n-1}}{2^n},$$

and by $g'_\alpha$ with probability $(1 - p_{\alpha,\beta})$, computed over the uniform distribution of $X \in \{0,1\}^n$. It is also useful to define the bias associated with $\mathrm{LAT}[\alpha, \beta]$:

$$b_{\alpha,\beta} = p_{\alpha,\beta} - \frac{1}{2} = \frac{\mathrm{LAT}[\alpha, \beta]}{2^n}. \qquad (2)$$

Obviously $b_{\alpha,\beta} \in [-\frac{1}{2}, \frac{1}{2}]$. A value of $b_{\alpha,\beta}$ which is (relatively) large in absolute value indicates a (relatively) high probability of success in approximating $f_\beta$ by an affine function. It is such approximations that are exploited by linear cryptanalysis (Section 4). (Conversely, a bias value of 0 yields no information to the cryptanalyst.) One measure of the resistance of an s-box S to approximation by affine functions is the nonlinearity of S, NL(S):

$$\mathrm{NL}(S) = 2^{n-1} - \max\{|\mathrm{LAT}[\alpha, \beta]| : \alpha, \beta \in \{0,1\}^n,\ \alpha, \beta \ne 0\}.$$

Another useful value is the minimum nonlinearity over the s-boxes of the SPN, $\mathrm{NL}_{\min}$:

$$\mathrm{NL}_{\min} = \min\{\mathrm{NL}(S) : S \in \mathrm{SPN}\}.$$
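The definitions of this section translate directly into a brute-force computation. The sketch below is illustrative only; the function names are ours, masks and s-box inputs are encoded as integers, and an s-box is given as a list of its 2^n outputs.

```python
from fractions import Fraction


def dot(a, x):
    """Inner product over GF(2) of two bit vectors encoded as integers."""
    return bin(a & x).count("1") & 1


def lat(sbox, n):
    """LAT[a][b] = #{X : a.X = b.S(X)} - 2^(n-1)."""
    size = 2 ** n
    return [[sum(dot(a, x) == dot(b, sbox[x]) for x in range(size)) - size // 2
             for b in range(size)]
            for a in range(size)]


def bias(table, a, b, n):
    """Bias of the approximation (a, b), i.e. LAT[a][b] / 2^n."""
    return Fraction(table[a][b], 2 ** n)


def nonlinearity(sbox, n):
    """NL(S) = 2^(n-1) - max |LAT[a][b]| over nonzero masks a, b."""
    size = 2 ** n
    table = lat(sbox, n)
    worst = max(abs(table[a][b])
                for a in range(1, size) for b in range(1, size))
    return 2 ** (n - 1) - worst
```

As a sanity check, the identity s-box is itself linear, so its nonlinearity is 0 and every diagonal entry of its table equals 2^(n-1).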



4 Linear Cryptanalysis of SPN’s 



Linear cryptanalysis is a known-plaintext attack (ciphertext-only under certain 
conditions) which was introduced by Matsui in 1993 Q. Matsui demonstrated 
that linear cryptanalysis could break DES using 2^^ known plaintexts Here 
we present linear cryptanalysis in the context of SPN’s where (absent the com- 
plexities of DES) the basic concepts are more easily stated. Linear cryptanal- 
ysis of SPN’s has been considered to some extent by, among others, Heys and 
Tavares ^ and Youssef 

The basic linear attack attempts to extract the equivalent of one key bit, expressed as the XOR sum of a subset of key bits, using a linear equation which relates subsets of the plaintext (P), ciphertext (C) and key (K) bits:

$$P_{i_1} \oplus P_{i_2} \oplus \cdots \oplus P_{i_a} \oplus C_{j_1} \oplus C_{j_2} \oplus \cdots \oplus C_{j_b} = K_{k_1} \oplus K_{k_2} \oplus \cdots \oplus K_{k_c} \qquad (3)$$

(here $P_{i_1}$, for example, denotes the $i_1$-th bit of P, numbering from left to right). Such an equation holds with some probability p (and associated bias $b = p - \frac{1}{2}$), computed over the uniform distribution of plaintexts. Matsui’s Algorithm 1 extracts the key bit represented by the right-hand side of (3) (with success rate 97.7%) by encrypting $\mathcal{N}_L$ random plaintexts, where

$$\mathcal{N}_L = \frac{1}{b^2} \qquad (4)$$

(increasing (reducing) the number of random plaintexts encrypted increases (reduces) the probability that the key bit will be determined correctly).



One-Round Linear Characteristics

A system linear approximation such as (3) can be constructed from one-round linear approximations, also known as linear characteristics. Specifically, a one-round characteristic for round r is a tuple

$$\Omega_r = (\Gamma_P^r, \Gamma_C^r, \Gamma_K^r, b_r), \qquad (5)$$

where $\Gamma_P^r, \Gamma_C^r \in \{0,1\}^N$ and $\Gamma_K^r \in \{0,1\}^{N(R+1)}$; $\Gamma_K^r$ contains $\Gamma_P^r$ in its $r$-th N-bit subblock, and is zero elsewhere; and bias $b_r \in [-\frac{1}{2}, \frac{1}{2}]$. Let $S^r(\cdot)$ denote application of the round r s-boxes, which are indexed left to right as $S_1^r, S_2^r, \ldots, S_n^r$.






Liam Keliher, Henk Meijer, and Stafford Tavares 



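We can view $\Gamma_P^r$ ($\Gamma_C^r$) as an input (output) mask for round r, and specifically as the concatenation of n n-bit input (output) masks for the $S_i^r$, denoted $\Gamma_{P,1}^r, \Gamma_{P,2}^r, \ldots, \Gamma_{P,n}^r$ ($\Gamma_{C,1}^r, \Gamma_{C,2}^r, \ldots, \Gamma_{C,n}^r$). For $1 \le i \le n$, if both $\Gamma_{P,i}^r$ and $\Gamma_{C,i}^r$ are nonzero then $S_i^r$ is called an active s-box. If the active s-boxes in round r are $S_{i_1}^r, \ldots, S_{i_A}^r$, and $b_a^r$ is the bias associated with $\mathrm{LAT}[\Gamma_{P,i_a}^r, \Gamma_{C,i_a}^r]$ for $S_{i_a}^r$ ($1 \le a \le A$), then

$$b_r = 2^{A-1} \prod_{a=1}^{A} b_a^r \qquad (6)$$

by Matsui’s Piling-up Lemma. Note that $|b_r| \le |b_a^r|$ for $1 \le a \le A$.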
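The Piling-up Lemma combination used in (6) is easy to check numerically. The helper below is a small sketch (the name `piling_up` is ours); it uses exact rational arithmetic so that the inequality noted above can be observed without rounding error.

```python
from fractions import Fraction


def piling_up(biases):
    """Combine A independent biases as 2^(A-1) * product of the biases."""
    result = Fraction(1, 2)
    for b in biases:
        result *= 2 * Fraction(b)  # (1/2) * prod(2 b_a) = 2^(A-1) * prod(b_a)
    return result
```

For example, two active s-boxes with bias 1/4 each give a round bias of 1/8, which is smaller in absolute value than either factor; a single zero bias makes the whole product zero.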

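It follows from the above, and from the definitions of Section 3, that for any independent key $K \in \{0,1\}^{N(R+1)}$ and uniformly chosen $X \in \{0,1\}^N$,

$$\mathrm{Prob}\{(\Gamma_P^r \bullet X) \oplus (\Gamma_C^r \bullet S^r(X \oplus K^r)) = (\Gamma_K^r \bullet K)\} = b_r + \frac{1}{2}.$$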

Multi-round Linear Characteristics

Given T one-round characteristics $\Omega_1, \Omega_2, \ldots, \Omega_T$ satisfying $\pi(\Gamma_C^t) = \Gamma_P^{t+1}$ for $1 \le t \le (T-1)$ (recall that $\pi(\cdot)$ is the inter-round permutation), a single T-round characteristic may be formed from their concatenation:

$$\Omega = (\Gamma_P^1, \Gamma_C^T, \Gamma_K, b),$$

where $\Gamma_K = \Gamma_K^1 \oplus \Gamma_K^2 \oplus \cdots \oplus \Gamma_K^T$, and

$$b = 2^{T-1} \prod_{t=1}^{T} b_t \qquad (7)$$

(again from the Piling-up Lemma). If T = R, and if $\Gamma'_K$ is derived from $\Gamma_K$ by setting the $(R+1)$-st (i.e., last) N-bit subblock of $\Gamma_K$ to $\Gamma_C^R$, then the linear equation represented by $\Omega' = (\Gamma_P^1, \Gamma_C^R, \Gamma'_K, b)$, namely

$$\Gamma_P^1 \bullet P \oplus \Gamma_C^R \bullet C = \Gamma'_K \bullet K, \qquad (8)$$

has the form of (3) (holding with probability $p = b + \frac{1}{2}$, over the uniform distribution of plaintexts, P).

In order to break DES, Matsui used auxiliary techniques which allowed a single linear characteristic to be used for the extraction of more than one key bit. Since such techniques are not relevant to the discussion which follows, we do not present them here.



5 Model for Distribution of Biases 

For the purpose of linear cryptanalysis, clearly the attacker is interested in the R-round linear characteristic whose accompanying bias is maximum in absolute value, termed the best linear characteristic (such a characteristic is not necessarily unique), since it minimizes $\mathcal{N}_L$ (see (4)). (For the auxiliary techniques mentioned in Section 4 and applied to SPN’s, $(R - \rho)$-round characteristics are used, for certain integers $\rho \ge 1$.) We limit our consideration to linear characteristics which activate one or more s-boxes in each round, since this is a necessary condition for the accompanying bias (computed using the Piling-up Lemma) to be nonzero. Note that this condition need not be enforced for linear characteristics of ciphers based on the Feistel network architecture, such as DES.

Let $\mathcal{L}_R$ be the set of all R-round linear characteristics. For a fixed SPN, i.e., for a fixed set of s-boxes, and for a given $\Omega \in \mathcal{L}_R$, let $b(\Omega)$ be the bias associated with $\Omega$. Define

$$B_R = \max\{|b(\Omega)| : \Omega \in \mathcal{L}_R\}.$$

Clearly $B_R \in [0, \frac{1}{2}]$. In addition, let $\mathcal{L}_R^A$ be the set of all R-round linear characteristics which activate a total of A s-boxes ($A \ge R$), and define

$$B_R^A = \max\{|b(\Omega)| : \Omega \in \mathcal{L}_R^A\}.$$

5.1 Modeling Biases of Characteristics in $\mathcal{L}_R^R$

We began our research by creating a computer program to search for the best R-round linear characteristic of a given SPN, for varying values of n and R. The program, tailored to the SPN structure, is based on Matsui’s algorithm which finds the best linear characteristic of DES. We quickly observed that the best characteristic almost always involved one active s-box in each round (i.e., it belonged to $\mathcal{L}_R^R$), especially as the s-box dimension was increased. In fact, when 500 16-round SPN’s with 8 × 8 s-boxes were generated at random, the best linear characteristic for the first r rounds, $1 \le r \le 16$, was always found to be in $\mathcal{L}_r^r$.

This is not fully intuitive: increasing the number of active s-boxes in a given round allows the search algorithm more choices for the input mask to the next round, potentially increasing the absolute value of the bias associated with that round; but it also decreases the absolute value of the bias for the round having multiple active s-boxes, by increasing the number of terms in the product of the Piling-up Lemma (see (6)).

In this section, in keeping with the above observation, we derive information about the distribution of values of $B_R^R$. We begin with the following result.

Lemma 1. Let S be a bijective n × n s-box, $n \ge 2$, and let $\alpha, \beta \in \{0,1\}^n$, with $\alpha, \beta \ne 0$. Then the set of possible values for the bias associated with $\mathrm{LAT}[\alpha, \beta]$ is

$$\left\{ \pm\frac{2\ell}{2^n} : \ell \text{ an integer},\ 0 \le \ell \le 2^{n-2} \right\},$$

where the biases $\pm\frac{2\ell}{2^n}$ each occur with probability

$$\frac{\binom{2^{n-1}}{2^{n-2}+\ell}^2}{\binom{2^n}{2^{n-1}}},$$

computed over the uniform distribution of bijective n × n s-boxes.

The probability distribution given by Lemma 1 for n = 8 is plotted in Figure 2 (using a $\log_{10}$ scale on the vertical axis).
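The distribution in Lemma 1 can be tabulated exactly with integer binomials. The following sketch assumes the probability has the form $\binom{2^{n-1}}{2^{n-2}+\ell}^2 / \binom{2^n}{2^{n-1}}$ for the bias $2\ell/2^n$, with negative $\ell$ covering the negative biases; the function name is ours.

```python
from fractions import Fraction
from math import comb


def lat_bias_distribution(n):
    """Map each possible bias 2l/2^n of one nonzero LAT entry to its probability."""
    half, quarter = 2 ** (n - 1), 2 ** (n - 2)
    total = comb(2 ** n, half)
    return {Fraction(l, half): Fraction(comb(half, quarter + l) ** 2, total)
            for l in range(-quarter, quarter + 1)}
```

Summing the returned probabilities gives exactly 1 (an instance of Vandermonde's identity), which is a convenient consistency check on the formula.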




Fig. 2. Probability distribution for bias value of single LAT entry (8x8 bijective 
s-box) 



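Before proceeding to the next lemma, it is useful to define the following two sets, for $R \ge 1$ and $n \ge 2$:

$$H_n = \{\ell : \ell \text{ an integer},\ 1 \le \ell \le 2^{n-2}\},$$

$$\mathcal{H}_n^R = \{\ell_1 \ell_2 \cdots \ell_R : \ell_r \in H_n \text{ for } 1 \le r \le R\}.$$

Lemma 2. Let $\Omega \in \mathcal{L}_R^R$. Then the set of possible nonzero values for $b(\Omega)$ is

$$\left\{ \pm\frac{h}{2^{(n-2)R+1}} : h \in \mathcal{H}_n^R \right\}, \qquad (9)$$

where the biases $\pm\frac{h}{2^{(n-2)R+1}}$ each occur with probability

$$2^{R-1} \sum_{\substack{\ell_1, \ldots, \ell_R \in H_n \\ \ell_1 \ell_2 \cdots \ell_R = h}} \ \prod_{r=1}^{R} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_r}^2}{\binom{2^n}{2^{n-1}}}, \qquad (10)$$

computed over all R-round SPN’s with s-boxes chosen uniformly and independently from the set of bijective n × n s-boxes. The probability that $b(\Omega) = 0$ is given by

$$\sum_{T=0}^{R-1} \binom{R}{T} \left( \frac{\binom{2^{n-1}}{2^{n-2}}^2}{\binom{2^n}{2^{n-1}}} \right)^{R-T} 2^T \sum_{\ell_1, \ldots, \ell_T \in H_n} \ \prod_{t=1}^{T} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_t}^2}{\binom{2^n}{2^{n-1}}}.$$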

Proof. Each $\Omega \in \mathcal{L}_R^R$ represents a “chain” of active s-boxes through the SPN. Since $\Omega$ uniquely determines the n-bit input/output masks for each active s-box, the bias associated with each s-box is determined by a single LAT entry. Let $b_r$ be the bias associated with the active s-box in round r, for $1 \le r \le R$. It follows that $b_r$ takes on the values (with the corresponding probabilities) of Lemma 1. Now suppose that $b(\Omega) \ne 0$ (therefore each $b_r \ne 0$). Let $s = s(\Omega)$ be the number of active s-boxes whose bias is negative. Then from (7), $b(\Omega)$ is of the form

$$2^{R-1} \prod_{r=1}^{R} b_r = (-1)^s\, 2^{R-1}\, \frac{\ell_1 \ell_2 \cdots \ell_R}{2^{(n-1)R}} = (-1)^s\, \frac{\ell_1 \ell_2 \cdots \ell_R}{2^{(n-2)R+1}}, \quad \text{for some } \ell_1, \ldots, \ell_R \in H_n,$$

which gives (9).

Consider the case that $b(\Omega)$ is positive, i.e., $b(\Omega) = \frac{h}{2^{(n-2)R+1}}$ for some $h \in \mathcal{H}_n^R$ (so $s(\Omega)$ is even). We have $|b_r| = \frac{\ell_r}{2^{n-1}}$ for some $\ell_r \in H_n$ ($1 \le r \le R$), with $\prod_{r=1}^{R} \ell_r = h$. Keeping the $\ell_r$ fixed, there are $2^{R-1}$ ways of assigning +/− signs to the $b_r$ such that $s(\Omega)$ is even. For any such assignment, it follows from Lemma 1 that the probability that the active s-boxes will yield the sequence of biases $b_1, b_2, \ldots, b_R$ is

$$\prod_{r=1}^{R} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_r}^2}{\binom{2^n}{2^{n-1}}}.$$

Summing over all $\ell_1, \ell_2, \ldots, \ell_R$ such that $\ell_1 \ell_2 \cdots \ell_R = h$, we get (10). It is easy to see that the case $b(\Omega) = -\frac{h}{2^{(n-2)R+1}}$ occurs with the same probability.

The proof in the case $b(\Omega) = 0$ is based on the observation that a sequence of bias values $b_1, b_2, \ldots, b_R$ whose product is 0 must consist of T nonzero values (and $(R - T)$ zero values), for some T, with $0 \le T \le (R - 1)$. The details are omitted here.

Lemma 3. $\#\mathcal{L}_R^R = n^R (2^n - 1)^2$.

Proof. The number of ways to choose one active s-box per round is $n^R$. For a given choice of active s-boxes, the n-bit output mask for the active s-box in round r, $1 \le r \le (R - 1)$, is determined: it consists of all zeros with a 1 in position j, where j is the index of the active s-box in round r + 1. Similarly, the n-bit input masks for the active s-boxes in rounds 2 ... R are determined. All that remains is the choice of input mask for the active s-box in round 1, and the output mask for the active s-box in round R. Since each such n-bit mask must be nonzero, we have $(2^n - 1)^2$ choices, and the result follows.

The main result of this section is given in Theorem 1. First, however, it is useful to have the following intermediate result. For $\Omega \in \mathcal{L}_R^R$ and $\lambda \in (0, \frac{1}{2}]$, define

$$p_R^R(\lambda) = \mathrm{Prob}\{|b(\Omega)| \ge \lambda\},$$

computed over all R-round SPN’s with s-boxes chosen uniformly and independently from the set of bijective n × n s-boxes. Arguing as in the proof of Lemma 2, it can be shown that $p_R^R(\lambda)$ is independent of the choice of $\Omega \in \mathcal{L}_R^R$.

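Lemma 4. Let $\lambda \in (0, \frac{1}{2}]$, and define $\Lambda = \lambda \cdot 2^{(n-2)R+1}$. Then

$$p_R^R(\lambda) = 2^R \sum_{\substack{h \in \mathcal{H}_n^R \\ h \ge \Lambda}} \ \sum_{\substack{\ell_1, \ldots, \ell_R \in H_n \\ \ell_1 \cdots \ell_R = h}} \ \prod_{r=1}^{R} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_r}^2}{\binom{2^n}{2^{n-1}}}.$$

Proof. Let $\Omega \in \mathcal{L}_R^R$. Then $p_R^R(\lambda) = \mathrm{Prob}\{|b(\Omega)| \ge \lambda\} = 2 \cdot \mathrm{Prob}\{b(\Omega) \ge \lambda\}$, since the distribution of probabilities corresponding to the possible values of $b(\Omega)$ is symmetric about 0, by Lemma 2. Therefore, we can assume that $b(\Omega)$ is positive, and write $b(\Omega) = \frac{h}{2^{(n-2)R+1}}$ for some $h \in \mathcal{H}_n^R$. It follows that

$$p_R^R(\lambda) = 2 \cdot \mathrm{Prob}\{b(\Omega) \ge \lambda\}$$
$$= 2 \cdot \mathrm{Prob}\left\{\frac{h}{2^{(n-2)R+1}} \ge \lambda\right\}$$
$$= 2 \cdot \mathrm{Prob}\{h \ge \Lambda\}$$
$$= 2^R \sum_{\substack{h \in \mathcal{H}_n^R \\ h \ge \Lambda}} \ \sum_{\substack{\ell_1, \ldots, \ell_R \in H_n \\ \ell_1 \cdots \ell_R = h}} \ \prod_{r=1}^{R} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_r}^2}{\binom{2^n}{2^{n-1}}}, \qquad (11)$$

where (11) follows from (10) in Lemma 2.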



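Theorem 1. Consider an R-round SPN with n × n s-boxes, n per round, and assume that each s-box is chosen uniformly and independently from the set of all bijective n × n s-boxes. Let the inter-round permutation, $\pi(\cdot)$, be as above. If $\lambda \in (0, \frac{1}{2}]$, and $\Lambda = \lambda \cdot 2^{(n-2)R+1}$, then

$$\mathrm{Prob}\{B_R^R \ge \lambda\} \le (2n)^R (2^n - 1)^2 \sum_{\substack{h \in \mathcal{H}_n^R \\ h \ge \Lambda}} \ \sum_{\substack{\ell_1, \ldots, \ell_R \in H_n \\ \ell_1 \cdots \ell_R = h}} \ \prod_{r=1}^{R} \frac{\binom{2^{n-1}}{2^{n-2}+\ell_r}^2}{\binom{2^n}{2^{n-1}}}. \qquad (12)$$

Proof. We have

$$\mathrm{Prob}\{B_R^R \ge \lambda\} = \mathrm{Prob}\{\exists\, \Omega \in \mathcal{L}_R^R \text{ such that } |b(\Omega)| \ge \lambda\}$$
$$\le \sum_{\Omega \in \mathcal{L}_R^R} \mathrm{Prob}\{|b(\Omega)| \ge \lambda\} \qquad (13)$$
$$= \#\mathcal{L}_R^R \cdot p_R^R(\lambda), \qquad (14)$$

where (14) follows from (13) because the distribution of $b(\Omega)$ is independent of the choice of $\Omega$. Substituting the results from Lemma 3 and Lemma 4 into (14) and combining constant terms gives (12), finishing the proof.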



5.2 An Improved Result 

Since the initial submission of this paper, we have been able to generalize the main result (Theorem 1). The improved result, given in Theorem 2 below, upper bounds the probability that linear cryptanalysis of an SPN using linear characteristics is feasible, with no restriction on the number of active s-boxes (of course, we still require a minimum of one active s-box per round).

Theorem 2. Consider an R-round SPN with n × n s-boxes, n per round, and assume that each s-box is chosen uniformly and independently from the set of all bijective n × n s-boxes. Let the inter-round permutation, $\pi(\cdot)$, be as above. If $\lambda \in (0, \frac{1}{2}]$, and $\Lambda = \lambda \cdot 2^{(n-2)R+1}$, then

$$\mathrm{Prob}\{B_R \ge \lambda\} \le \sum_{A=R}^{nR} \#\mathcal{L}_R^A \cdot p_R^A(\lambda). \qquad (15)$$

Comment on Proof and Computation 

Arguing as in the proof of Theorem 1, it follows that each term in the sum of (15) of the form $\#\mathcal{L}_R^A \cdot p_R^A(\lambda)$ is an upper bound on the probability that linear cryptanalysis is feasible using characteristics from $\mathcal{L}_R^A$. Therefore the sum of all such terms is an upper bound on the probability that linear cryptanalysis using any $\Omega \in \mathcal{L}_R$ is feasible.

In order to extract useful values from (15), it is necessary to transform the right-hand side into an expression which can be evaluated. The sub-terms $p_R^A(\lambda)$ can be evaluated using a slightly modified version of Lemma 4. The main work lies in computing the sub-terms $\#\mathcal{L}_R^A$. We solved this in a recursive fashion. The key observation is that if $\Omega \in \mathcal{L}_R^A$ activates a s-boxes in round R, then the sub-characteristic $\Omega'$ obtained by removing round R is an element of $\mathcal{L}_{R-1}^{A-a}$. Counting the number of ways that an $R$-th round with a active s-boxes can be added to $\Omega'$, and summing over all values of a, completes the computation. The details can be found in the full version of this paper.
6 Computational Results

The second computer program created to carry out this research computes the distribution of biases associated with an R-round characteristic $\Omega \in \mathcal{L}_R^R$, as given by Lemma 2. The program works iteratively, computing the distribution for a given round r before proceeding to round (r + 1). For n = 8 and R = 3, the resulting distribution has 28451 bias values. These are plotted in Figure 3, using a $\log_{10}$ scale for the y-axis.
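The round-by-round computation described above can be sketched as follows. This is not the authors' program: it is a minimal illustration that assumes the single-entry distribution of Lemma 1 and folds in one round at a time with the Piling-up Lemma, using hypothetical function names and exact rational arithmetic.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb


def single_entry_distribution(n):
    """Bias distribution of one nonzero LAT entry of a random bijective s-box."""
    half, quarter = 2 ** (n - 1), 2 ** (n - 2)
    total = comb(2 ** n, half)
    return {Fraction(l, half): Fraction(comb(half, quarter + l) ** 2, total)
            for l in range(-quarter, quarter + 1)}


def chain_distribution(n, rounds):
    """Distribution of b(Omega) for a chain with one active s-box per round."""
    one = single_entry_distribution(n)
    dist = dict(one)
    for _ in range(rounds - 1):
        nxt = defaultdict(Fraction)
        for b, p in dist.items():
            for c, q in one.items():
                nxt[2 * b * c] += p * q  # piling-up: new bias = 2 * b * c
        dist = dict(nxt)
    return dist
```

For n = 2 and R = 2 the only nonzero biases are ±1/2, each with probability 1/18, in agreement with the closed form of Lemma 2; larger parameters simply take longer.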

Fig. 3. Distribution of $b(\Omega)$ for $\Omega \in \mathcal{L}_R^R$, with n = 8, R = 3 (horizontal axis: bias value; vertical axis: $\log_{10}$ of probability)




Linear cryptanalysis of an N-bit SPN using an R-round linear characteristic $\Omega \in \mathcal{L}_R^R$ is feasible if the number of plaintexts required in the best case is at most $2^N$, the number of plaintexts available, i.e., if

$$\mathcal{N}_L = \frac{1}{(B_R^R)^2} \le 2^N \iff B_R^R \ge 2^{-N/2} \qquad (16)$$

(this is for Matsui’s Algorithm 1, with a success rate of 97.7%). Setting $\lambda = 2^{-N/2}$, Theorem 1 gives an upper bound on the probability that (16) holds. A modified version of the program used to determine the distribution of $b(\Omega)$ above, for $\Omega \in \mathcal{L}_R^R$, was used to evaluate this upper bound, by computing (12). Results for the case n = 8, N = 64 (so $\lambda = 2^{-32}$), and R = 10 ... 16 are presented in the second column of Table 1. The third column of Table 1 gives the upper bound from the improved result of Theorem 2. In addition, the fourth column of Table 1 gives the experimental probability that (16) holds, obtained by generating 500 16-round, 64-bit SPN’s with random 8 × 8 s-boxes, and computing $B_r^r$ for the first r rounds, for $10 \le r \le 16$. (In fact we computed $B_r$, and found that in each case it was identical to $B_r^r$, as mentioned in Section 5.1.)

It is quite interesting that the generalized upper bound of Theorem 2 yields values which are very close to the values obtained from Theorem 1 (in fact, for $R \ge 13$ they are identical to four decimal places). This is evidence that the first term of the sum in (15) (which is exactly the upper bound of Theorem 1) is the dominant term, supporting our earlier observation that the best linear characteristic is usually found in $\mathcal{L}_R^R$.



Number of   Restricted        Unrestricted      Experimental   N_L lower bound
rounds      upper bound       upper bound       (500 trials)   (older result)
            (Theorem 1)       (Theorem 2)
10          -                 -                 1.000          2^30
11          -                 -                 0.962          2^33
12          4.3823 x 10^-?    4.3824 x 10^-?    0.000          2^36
13          4.6480 x 10^-?    4.6480 x 10^-?    0.000          2^39
14          8.8615 x 10^-?    8.8615 x 10^-?    0.000          2^42
15          2.8796 x 10^-?    2.8796 x 10^-?    0.000          2^44
16          1.5177 x 10^-?    1.5177 x 10^-?    0.000          2^47

Table 1. Probability that linear cryptanalysis is feasible, for n = 8, R = 10 . . . 16 (contrasted with lower bound on $\mathcal{N}_L$)



We have placed the values of Table 1 in the context of an earlier related result. Heys and Tavares, using the same SPN structure, give an expression which provides a lower bound on $\mathcal{N}_L$ in terms of $\mathrm{NL}_{\min}$, namely

$$\mathcal{N}_L \ge 2^{2(1-R)} \left[ \frac{2^n}{2^{n-1} - \mathrm{NL}_{\min}} \right]^{2R}. \qquad (17)$$

This is based on the worst-case scenario (from the perspective of the cipher designer): the existence of an R-round linear characteristic in $\mathcal{L}_R^R$, such that the absolute value of the bias associated with each active s-box is the maximum possible, namely $|(2^{n-1} - \mathrm{NL}_{\min})/2^n|$. Evaluating (17) in the case n = 8 (N = 64), with $\mathrm{NL}_{\min} = 80$, gives the values in the rightmost column of Table 1. Taken alone, these lower bounds seem to imply that linear cryptanalysis of at least 16 rounds is feasible (in fact, (17) does not tell us that linear cryptanalysis becomes infeasible until $R \ge 22$). However, the result of Theorem 2 shows this to be excessively pessimistic: the probability that linear cryptanalysis of an SPN is feasible, using any characteristic $\Omega \in \mathcal{L}_R$, is small for $R \ge 12$ (computed over all SPN’s, as per our model). This evidence of resistance to linear cryptanalysis is especially interesting when compared to a result of Chen, who showed that under certain assumptions about the XOR tables of the s-boxes, the same 64-bit SPN is also resistant to differential cryptanalysis for $R \ge 12$.
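The worst-case figures in the rightmost column of Table 1 can be reproduced from the description above: every active s-box attains the bias $(2^{n-1} - \mathrm{NL}_{\min})/2^n$, the R biases are combined with the Piling-up Lemma, and $\mathcal{N}_L = 1/b^2$ as in (4). The sketch below uses hypothetical function names and floating-point arithmetic, so it reports $\log_2 \mathcal{N}_L$ only approximately.

```python
from math import log2


def worst_case_bias(n, nl_min, rounds):
    """Bias of a chain whose active s-boxes all attain the maximum bias."""
    per_sbox = (2 ** (n - 1) - nl_min) / 2 ** n
    return 2 ** (rounds - 1) * per_sbox ** rounds


def log2_plaintexts(n, nl_min, rounds):
    """log2 of N_L = 1 / b^2 for the worst-case chain."""
    return -2 * log2(worst_case_bias(n, nl_min, rounds))
```

With n = 8 and NL_min = 80, rounding `log2_plaintexts` for R = 10 to 16 gives 30, 33, 36, 39, 42, 44, 47, and the value first exceeds 64 at R = 22.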
7 Conclusion 

In this paper we have presented a model for the bias values associated with linear 
characteristics of substitution-permutation networks. We first considered linear 
characteristics which activate one s-box in each round, since experimentally these 
usually provide the best bias value. We determined the distribution of bias values 
which can be associated with such characteristics. This allowed us to evaluate 
an upper bound on the probability that linear cryptanalysis using such linear 
characteristics is feasible, as a function of the number of rounds. This probability 
is computed over all SPN’s with s-boxes chosen uniformly and independently 
from the set of all bijective n x n s-boxes. 

We then gave a generalization of the above result, stating an upper bound on 
the probability that linear cryptanalysis of an SPN is feasible, with no restriction 
on the number of s-boxes activated by the linear characteristics used. Experi- 
mental data indicates that the restricted and the generalized upper bounds yield 
nearly identical values, supporting the observation that the best linear charac- 
teristics almost always activate one s-box per round. 

The work of this paper further supports the idea that the basic SPN structure 
merits study, both as a source of theoretical results, and as a practical cipher 
architecture with good security properties after a relatively small number of 
rounds. 
Abstract. This paper proves (i) in any (n — l)-dimensional linear sub- 
space, the non-propagative vectors of a function with n variables are 
linearly dependent, (ii) for this function, there exists a non-propagative 
vector in any (n — 2)-dimensional linear subspace and there exist three 
non-propagative vectors in any (n — l)-dimensional linear subspace, ex- 
cept for those functions whose nonlinearity takes special values. 
Keywords: Cryptography, Boolean Function, Propagation, Nonlinear- 
ity. 



1 Introduction 

In examining the nonlinearity properties of a function f with n variables, it is important to understand $\Re_f$, the set of so-called non-propagative vectors where f does not satisfy the propagation criterion. In this work, we are concerned with both $\#\Re_f$ (the number of non-propagative vectors in $\Re_f$) and the distribution of $\Re_f$. More specifically, we prove two properties of $\Re_f$. One is called the strong linear dependence and the other the unbiased distribution, of $\Re_f$.

The strong linear dependence property states that if W is an (n − 1)-dimensional linear subspace satisfying $\#(\Re_f \cap W) \ge 4$, then the non-zero vectors in $\Re_f \cap W$ are linearly dependent. This improves a previously known result. The unbiased distribution property says that any function f with n variables, except for those whose nonlinearity takes the special value of $2^{n-1} - 2^{\frac{1}{2}(n-1)}$, $2^{n-1} - 2^{\frac{1}{2}n}$ or $2^{n-1} - 2^{\frac{1}{2}n-1}$, fulfills the condition that every (n − 2)-dimensional linear subspace contains a non-zero vector in $\Re_f$ and every (n − 1)-dimensional linear subspace contains at least three non-zero vectors in $\Re_f$. In special cases, $\#(\Re_f \cap W)$ may significantly affect other cryptographic properties of a function. The strong linear dependence and the unbiased distribution are helpful for the design of cryptographic functions as these conclusions provide more information on the number and the status of non-propagative vectors in any (n − 1)-dimensional linear subspace.
2 Cryptographic Criteria of Boolean Functions 



We consider functions from W, to GF{2) (or simply functions on V„), Vn is 
the vector space of n tuples of elements from GF{2). The truth table of a 
function / on is a (0, l)-sequence defined by {f{ao), f{ai),..., /(a 2 "-i)), 
and the sequence of / is a (1, — l)-sequence defined by ((— , (— 

where oo = (0,...,0,0), ai = (0,...,0,1), . . = 
(1, . . 1, 1). The matrix of / is a (1, — l)-matrix of order 2" defined by M = 
((_l)/(“i®“j)) where 0 denotes the addition in GF{2). f is said to be balanced 
if its truth table contains an equal number of ones and zeros. 

Given two sequences a = (oi, • • • , am) and b = (6i, • • • , bm), their component- 
wise product is defined by a * 6 = (ai5i, • • • , ambm)- In particular, if m = 2” and 
a, b are the sequences of functions / and g on Vn respectively, then d * b is the 
sequence of / © g where © denotes the addition in GF{2). 

Let $a = (a_1, \ldots, a_m)$ and $b = (b_1, \ldots, b_m)$ be two sequences or vectors. The scalar product of $a$ and $b$, denoted by $\langle a, b \rangle$, is defined as the sum of the component-wise multiplications. In particular, when $a$ and $b$ are from $V_m$, $\langle a, b \rangle = a_1b_1 \oplus \cdots \oplus a_mb_m$, where the addition and multiplication are over $GF(2)$, and when $a$ and $b$ are $(1,-1)$-sequences, $\langle a, b \rangle = \sum_{i=1}^{m} a_ib_i$, where the addition and multiplication are over the reals.

An affine function $f$ on $V_n$ is a function that takes the form of $f(x_1, \ldots, x_n) = a_1x_1 \oplus \cdots \oplus a_nx_n \oplus c$, where $a_j, c \in GF(2)$, $j = 1, 2, \ldots, n$. Furthermore $f$ is called a linear function if $c = 0$.

A $(1,-1)$-matrix $N$ of order $n$ is called a Hadamard matrix if $NN^T = nI_n$, where $N^T$ is the transpose of $N$ and $I_n$ is the identity matrix of order $n$. A Sylvester-Hadamard matrix of order $2^n$, denoted by $H_n$, is generated by the following recursive relation

$$H_0 = 1, \quad H_n = \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}, \quad n = 1, 2, \ldots$$



Let $\ell_i$, $0 \le i \le 2^n - 1$, be the $i$th row of $H_n$. It is known that $\ell_i$ is the sequence of a linear function $\varphi_i(x)$ defined by the scalar product $\varphi_i(x) = \langle \alpha_i, x \rangle$, where $\alpha_i$ is the $i$th vector in $V_n$ according to the ascending alphabetical order.

The Hamming weight of a $(0,1)$-sequence $\xi$, denoted by $W(\xi)$, is the number of ones in the sequence. Given two functions $f$ and $g$ on $V_n$, the Hamming distance $d(f, g)$ between them is defined as the Hamming weight of the truth table of $f(x) \oplus g(x)$, where $x = (x_1, \ldots, x_n)$.



Definition 1. The nonlinearity of a function $f$ on $V_n$, denoted by $N_f$, is the minimal Hamming distance between $f$ and all affine functions on $V_n$, i.e., $N_f = \min_{i=1,2,\ldots,2^{n+1}} d(f, \varphi_i)$ where $\varphi_1, \varphi_2, \ldots, \varphi_{2^{n+1}}$ are all the affine functions on $V_n$.



The following characterisations of nonlinearity will be useful (for a proof see
for instance Q).

Lemma 1. The nonlinearity of $f$ on $V_n$ can be expressed by

$$N_f = 2^{n-1} - \frac{1}{2}\max\{|\langle \xi, \ell_i \rangle|, \; 0 \le i \le 2^n - 1\}$$

where $\xi$ is the sequence of $f$ and $\ell_0, \ldots, \ell_{2^n-1}$ are the rows of $H_n$, namely, the sequences of linear functions on $V_n$.
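As a minimal sketch of Lemma 1 (an illustration, not code from the paper), the following Python computes all scalar products $\langle \xi, \ell_i \rangle$ with a fast Walsh-Hadamard transform and derives $N_f$ from them. The function names, and the convention that $\alpha_i$ is the binary representation of $i$ with $x_1$ as the most significant bit, are assumptions made here.

```python
def sequence(truth_table):
    """(1,-1)-sequence of a function given its (0,1) truth table."""
    return [(-1) ** bit for bit in truth_table]


def walsh_hadamard(seq):
    """Return [<seq, l_i>] for every row l_i of the Sylvester-Hadamard matrix."""
    out = list(seq)
    h = 1
    while h < len(out):
        # Butterfly step: combine blocks of length h pairwise.
        for start in range(0, len(out), 2 * h):
            for k in range(start, start + h):
                a, b = out[k], out[k + h]
                out[k], out[k + h] = a + b, a - b
        h *= 2
    return out


def nonlinearity(truth_table):
    """N_f = 2^(n-1) - max|<xi, l_i>| / 2, as in Lemma 1."""
    spectrum = walsh_hadamard(sequence(truth_table))
    return len(truth_table) // 2 - max(abs(v) for v in spectrum) // 2
```

For example, the truth table of $f(x_1, x_2, x_3) = x_1 \oplus x_2x_3$ can be built as `[((i >> 2) & 1) ^ (((i >> 1) & 1) & (i & 1)) for i in range(8)]`, and `nonlinearity` applied to it gives 2.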

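Definition 2. Let $f$ be a function on $V_n$. For a vector $\alpha \in V_n$, denote by $\xi(\alpha)$ the sequence of $f(x \oplus \alpha)$. Thus $\xi(0)$ is the sequence of $f$ itself and $\xi(0) * \xi(\alpha)$ is the sequence of $f(x) \oplus f(x \oplus \alpha)$. Set

$$\Delta_f(\alpha) = \langle \xi(0), \xi(\alpha) \rangle,$$

the scalar product of $\xi(0)$ and $\xi(\alpha)$. $\Delta_f(\alpha)$ is called the auto-correlation of $f$ with a shift $\alpha$. Write

$$\Delta_M = \max\{|\Delta(\alpha)| \mid \alpha \in V_n, \alpha \ne 0\}$$

We omit the subscript of $\Delta_f(\alpha)$ if no confusion occurs.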

Definition 3. Let $f$ be a function on $V_n$. We say that $f$ satisfies the propagation criterion with respect to $\alpha$ if $f(x) \oplus f(x \oplus \alpha)$ is a balanced function, where $x = (x_1, \ldots, x_n)$ and $\alpha$ is a vector in $V_n$. Furthermore $f$ is said to satisfy the propagation criterion of degree $k$ if it satisfies the propagation criterion with respect to every non-zero vector $\alpha$ whose Hamming weight is not larger than $k$
(see 

The strict avalanche criterion (SAC) Q is the same as the propagation cri- 
terion of degree one. 

Obviously, $\Delta(\alpha) = 0$ if and only if $f(x) \oplus f(x \oplus \alpha)$ is balanced, i.e., $f$ satisfies the propagation criterion with respect to $\alpha$.
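The auto-correlation and the propagation criterion can be sketched directly from Definitions 2 and 3. The Python below is only an illustration. Vectors are encoded as integers (an assumption), so that $x \oplus \alpha$ becomes a bitwise XOR of indices into the truth table.

```python
def autocorrelation(truth_table, alpha):
    """Delta(alpha): sum over x of (-1)^(f(x) xor f(x xor alpha))."""
    return sum((-1) ** (truth_table[x] ^ truth_table[x ^ alpha])
               for x in range(len(truth_table)))


def satisfies_propagation_criterion(truth_table, alpha):
    """True when f(x) xor f(x xor alpha) is balanced, i.e. Delta(alpha) = 0."""
    return autocorrelation(truth_table, alpha) == 0


def non_propagative_set(truth_table):
    """All alpha with Delta(alpha) != 0; alpha = 0 is always included."""
    return [a for a in range(len(truth_table))
            if autocorrelation(truth_table, a) != 0]
```

For $f(x_1, x_2, x_3) = x_1 \oplus x_2x_3$ (with $x_1$ as the most significant bit of the index), the only non-zero vector returned by `non_propagative_set` is $(1,0,0)$, whose auto-correlation is $-8$.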

Definition 4. Let $f$ be a function on $V_n$. $\alpha \in V_n$ is called a linear structure of $f$ if $|\Delta(\alpha)| = 2^n$ (i.e., $f(x) \oplus f(x \oplus \alpha)$ is a constant).

For any function $f$, $\Delta(\alpha_0) = 2^n$, where $\alpha_0$ is the zero vector on $V_n$. It is easy to verify that the set of all linear structures of a function $f$ forms a linear subspace of $V_n$, whose dimension is called the linearity of $f$. It is also well-known that if $f$ has non-zero linear structures, then there exists a nonsingular $n \times n$ matrix $B$ over $GF(2)$ such that $f(xB) = g(y) \oplus h(z)$, where $x = (y, z)$, $y \in V_p$, $z \in V_q$, $g$ is a function on $V_p$ that has no non-zero linear structures, and $h$ is an affine function on $V_q$.

The following lemma is the re-statement of a relation proved in Section 2 

ofB. 

Lemma 2. For every function $f$ on $V_n$, we have

$$(\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n-1}))H_n = (\langle \xi, \ell_0 \rangle^2, \langle \xi, \ell_1 \rangle^2, \ldots, \langle \xi, \ell_{2^n-1} \rangle^2)$$

where $\xi$ denotes the sequence of $f$ and $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.
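The identity of Lemma 2 is easy to check numerically on small truth tables. The sketch below is an illustration only: it builds $H_n$ by the Sylvester recursion and compares both sides entry by entry. The helper names are assumptions.

```python
def hadamard(n):
    """Sylvester-Hadamard matrix H_n as a list of rows."""
    h = [[1]]
    for _ in range(n):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def check_lemma2(truth_table):
    """Compare (Delta(alpha_0), ..., Delta(alpha_{2^n-1})) H_n with the squared spectrum."""
    size = len(truth_table)
    n = size.bit_length() - 1
    H = hadamard(n)
    xi = [(-1) ** b for b in truth_table]
    delta = [sum(xi[x] * xi[x ^ a] for x in range(size)) for a in range(size)]
    lhs = [sum(delta[a] * H[a][i] for a in range(size)) for i in range(size)]
    rhs = [sum(xi[x] * H[i][x] for x in range(size)) ** 2 for i in range(size)]
    return lhs == rhs
```

Calling `check_lemma2` on any truth table whose length is a power of two should return `True`, since the lemma holds for every function.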

The balance and the nonlinearity are necessary in most cases. The propagation, or especially the SAC, is an important cryptographic criterion.

3 Introduction to $\Re$

Notation 1. Let $f$ be a function on $V_n$. Set $\Re_f = \{\alpha \mid \Delta(\alpha) \ne 0, \alpha \in V_n\}$, $\Delta_M = \max\{|\Delta(\alpha)| \mid \alpha \in V_n, \alpha \ne 0\}$.

We simply write $\Re_f$ as $\Re$ if no confusion occurs. It is easy to verify that $\#\Re$ and $\Delta_M$ are invariant under any nonsingular linear transformation on the variables, where $\#$ denotes the cardinal number of a set.

$\#\Re$ and the distribution of $\Re$ reflect the propagation characteristics, while $\Delta_M$ forecasts the avalanche property of the function. Therefore information on $\Re$ and $\Delta_M$ is useful in determining important cryptographic characteristics of $f$. Usually, small $\#\Re$ and $\Delta_M$ are desirable.

Definition 5. A function $f$ on $V_n$ is called a bent function if $\langle \xi, \ell_i \rangle^2 = 2^n$ for every $i = 0, 1, \ldots, 2^n - 1$, where $\ell_i$ is the $i$th row of $H_n$.

A bent function on $V_n$ exists only when $n$ is even, and it achieves the highest possible nonlinearity $2^{n-1} - 2^{\frac{1}{2}n-1}$. The algebraic degree of bent functions on $V_n$ is at most $\frac{1}{2}n$. From ^ and Parseval’s equation, we have the following:

Theorem 1. Let $f$ be a function on $V_n$. Then the following statements are equivalent: (i) $f$ is bent, (ii) $\#\Re = 1$, (iii) $\Delta_M = 0$, (iv) the nonlinearity of $f$, $N_f$, satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}n-1}$, (v) the matrix of $f$ is an Hadamard matrix.
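The equivalences in Theorem 1 can be observed on a small example. The following sketch, given for illustration under the assumption that $n$ is even, evaluates statements (i) to (v) by brute force. `bent_criteria` is a hypothetical helper name.

```python
from math import isqrt


def bent_criteria(truth_table):
    """Evaluate statements (i)-(v) of Theorem 1; the number of variables is assumed even."""
    size = len(truth_table)  # 2^n
    xi = [(-1) ** b for b in truth_table]
    walsh = [sum(xi[x] * (-1) ** bin(i & x).count("1") for x in range(size))
             for i in range(size)]
    delta = [sum(xi[x] * xi[x ^ a] for x in range(size)) for a in range(size)]
    matrix = [[xi[i ^ j] for j in range(size)] for i in range(size)]
    nonlin = size // 2 - max(abs(w) for w in walsh) // 2
    is_hadamard = all(
        sum(matrix[i][k] * matrix[j][k] for k in range(size)) == (size if i == j else 0)
        for i in range(size) for j in range(size))
    return (
        all(w * w == size for w in walsh),        # (i)   f is bent
        sum(1 for d in delta if d != 0) == 1,     # (ii)  #R = 1
        max(abs(d) for d in delta[1:]) == 0,      # (iii) Delta_M = 0
        nonlin == size // 2 - isqrt(size) // 2,   # (iv)  N_f = 2^(n-1) - 2^(n/2-1)
        is_hadamard,                              # (v)   matrix of f is Hadamard
    )
```

For the bent function $x_1x_2 \oplus x_3x_4$ on $V_4$ all five entries are `True`, while for $x_1x_2$ viewed as a function on $V_4$ all five are `False`.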

The following result is called the linear dependence theorem that can be found 
in B 

Theorem 2. Let $f$ be a function on $V_n$ that satisfies the propagation criterion with respect to all but $k + 1$ vectors $0, \beta_1, \ldots, \beta_k$ in $V_n$, where $k \ge 2$. Then $\beta_1, \ldots, \beta_k$ are linearly dependent, namely, there exist $k$ constants $c_1, \ldots, c_k \in GF(2)$, not all of which are zeros, such that $c_1\beta_1 \oplus \cdots \oplus c_k\beta_k = 0$.

Note that $n + 1$ non-zero vectors in $V_n$ must be linearly dependent. Hence if $\#\Re \ge n + 2$ (i.e., $\#(\Re - \{0\}) \ge n + 1$) then Theorem 2 is trivial. For this reason, we improve Theorem 2 in this paper. We prove two properties of $\Re$: the strong linear dependence and the unbiased distribution of $\Re$.

4 The Strong Linear Dependence Theorem 

Note the $i$th (i.e., the $\alpha_i$th) row of $H_n$, where $\alpha_i \in V_n$ is the binary representation of integer $i$, $i = 0, 1, \ldots, 2^n - 1$, is the sequence of linear function $\varphi_i(x) = \langle \alpha_i, x \rangle$. Lemma 4 of H can be restated as follows:

Lemma 3. Let $Q$ be the $q \times 2^n$ matrix that consists of the $\alpha_{j_1}$th, $\ldots$, the $\alpha_{j_q}$th rows of $H_n$, where each $\alpha_j \in V_n$ is the binary representation of integer $j$, $0 \le j \le 2^n - 1$. If $\alpha_{j_1}, \ldots, \alpha_{j_q}$ are linearly independent then each $(a_1, \ldots, a_q)^T$, where each $a_j = \pm 1$, appears as a column in $Q$ precisely $2^{n-q}$ times.
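Lemma 3 can be illustrated by counting column patterns for a small $n$. The sketch below is not from the paper. It assumes rows of $H_n$ are indexed by integers and that entry $(\alpha, x)$ equals $(-1)^{\langle \alpha, x \rangle}$.

```python
from collections import Counter


def column_pattern_counts(n, row_indices):
    """Count how often each +-1 column occurs in the selected rows of H_n."""
    rows = [[(-1) ** bin(alpha & x).count("1") for x in range(2 ** n)]
            for alpha in row_indices]
    return Counter(zip(*rows))
```

With $n = 4$ and the linearly independent rows 1, 2, 4, each of the 8 sign patterns occurs $2^{4-3} = 2$ times. With the dependent rows 1, 2, 3 only 4 patterns occur, which shows why independence is required.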
The following Lemma can be found in [^ .

Lemma 4. Let $n \ge 3$ be a positive integer and $2^n = \sum_{j=1}^{4} a_j^2$ where $a_1 \ge a_2 \ge a_3 \ge a_4 \ge 0$ and each $a_j$ is an integer. We have the following statements:

(i) if $n$ is odd, then $a_1^2 = a_2^2 = 2^{n-1}$, $a_3 = a_4 = 0$,

(ii) if $n$ is even, then $a_1^2 = 2^n$, $a_2 = a_3 = a_4 = 0$ or $a_1^2 = a_2^2 = a_3^2 = a_4^2 = 2^{n-2}$.
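For small $n$ the statement of Lemma 4 can be confirmed by exhaustive search. The following is a brute-force sketch for illustration only. `four_square_decompositions` is a hypothetical helper name.

```python
from math import isqrt


def four_square_decompositions(total):
    """All (a1, a2, a3, a4) with a1 >= a2 >= a3 >= a4 >= 0 and sum of squares = total."""
    results = []
    for a1 in range(isqrt(total), -1, -1):
        rest1 = total - a1 * a1
        for a2 in range(min(a1, isqrt(rest1)), -1, -1):
            rest2 = rest1 - a2 * a2
            for a3 in range(min(a2, isqrt(rest2)), -1, -1):
                rest3 = rest2 - a3 * a3
                a4 = isqrt(rest3)
                # Keep only exact squares that respect the ordering a4 <= a3.
                if a4 * a4 == rest3 and a4 <= a3:
                    results.append((a1, a2, a3, a4))
    return results
```

For $2^3$ and $2^5$ the only decompositions found are $(2,2,0,0)$ and $(4,4,0,0)$, matching case (i). For $2^4$ the search finds $(4,0,0,0)$ and $(2,2,2,2)$, matching case (ii).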

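Lemma 5. For every function $f$ on $V_n$, we have

$$2(\Delta(\alpha_0), \Delta(\alpha_2), \ldots, \Delta(\alpha_{2^n-2}))H_{n-1} = (\langle \xi, \ell_0 \rangle^2 + \langle \xi, \ell_1 \rangle^2, \langle \xi, \ell_2 \rangle^2 + \langle \xi, \ell_3 \rangle^2, \ldots, \langle \xi, \ell_{2^n-2} \rangle^2 + \langle \xi, \ell_{2^n-1} \rangle^2)$$

where $\xi$ denotes the sequence of $f$ and $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

Proof. From Lemma 2,

$$2^n(\Delta(\alpha_0), \Delta(\alpha_1), \ldots, \Delta(\alpha_{2^n-1})) = (\langle \xi, \ell_0 \rangle^2, \langle \xi, \ell_1 \rangle^2, \ldots, \langle \xi, \ell_{2^n-1} \rangle^2)H_n \qquad (1)$$

Comparing the 0th, the 2nd, $\ldots$, the $(2^n - 2)$th terms in the two sides of equality (1), we obtain

$$2^n(\Delta(\alpha_0), \Delta(\alpha_2), \ldots, \Delta(\alpha_{2^n-2})) = (\langle \xi, \ell_0 \rangle^2 + \langle \xi, \ell_1 \rangle^2, \langle \xi, \ell_2 \rangle^2 + \langle \xi, \ell_3 \rangle^2, \ldots, \langle \xi, \ell_{2^n-2} \rangle^2 + \langle \xi, \ell_{2^n-1} \rangle^2)H_{n-1}$$

This proves the lemma. □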

The following theorem is called the strong linear dependence theorem, which is an improvement on Theorem 2 (the linear dependence theorem).

Theorem 3. Let $f$ be a function on $V_n$, and $W$ be an $(n-1)$-dimensional linear subspace satisfying $\Re \cap W = \{0, \beta_1, \ldots, \beta_k\}$ ($k \ge 3$). Then $\beta_1, \ldots, \beta_k$ are linearly dependent, namely, there exist $k$ constants $c_1, \ldots, c_k \in GF(2)$ with $(c_1, \ldots, c_k) \ne (0, \ldots, 0)$, such that $c_1\beta_1 \oplus \cdots \oplus c_k\beta_k = 0$.



Proof. The theorem is obviously true if $k \ge n$. Now we prove the theorem for $k < n$. We only need to prove the theorem in the special case when $W$ is composed of $\alpha_0, \alpha_2, \ldots, \alpha_{2^n-2}$, where $\alpha_{2j} \in V_n$ is the binary representation of an even number $2j$, $j = 0, 1, \ldots, 2^{n-1} - 1$. In other words, $W$ is composed of all the vectors in $V_n$ that can be expressed in the form $(a_1, \ldots, a_{n-1}, 0)$, where each $a_j \in GF(2)$. In the general case, we can use a nonsingular linear transformation on the variables so as to change $W$ into the special case. Let $\xi$ be the sequence of $f$.

Since $\beta_j \in W$, $j = 1, \ldots, k$, $\beta_j$ can be expressed as $\beta_j = (\gamma_j, 0)$ where $\gamma_j \in V_{n-1}$, $j = 1, \ldots, k$, and $0 \in GF(2)$.

Let $P$ be a $(k + 1) \times 2^{n-1}$ matrix composed of the 0th, the $\gamma_1$th, $\ldots$, the $\gamma_k$th rows of $H_{n-1}$. Set $a_j^2 = \langle \xi, \ell_j \rangle^2$, $j = 0, 1, \ldots, 2^n - 1$. Note that $\Delta(\alpha) = 0$ if $\alpha \in W$ and $\alpha \notin \{0, \beta_1, \ldots, \beta_k\}$. Hence the equality in Lemma 5 can be specialized as

$$2(\Delta(0), \Delta(\beta_1), \ldots, \Delta(\beta_k))P = (a_0^2 + a_1^2, a_2^2 + a_3^2, \ldots, a_{2^n-2}^2 + a_{2^n-1}^2) \qquad (2)$$

where $\Delta(0)$ is identical to $\Delta(\alpha_0)$ where $\alpha_0 = 0$.

Write $P = (p_{ij})$, $i = 0, 1, \ldots, k$, $j = 0, 1, \ldots, 2^{n-1} - 1$. As the top row of $P$ is $(1, 1, \ldots, 1)$, from (2),

$$2\Big(\Delta(0) + \sum_{i=1}^{k} p_{ij}\Delta(\beta_i)\Big) = a_{2j}^2 + a_{2j+1}^2 \qquad (3)$$

$j = 0, 1, \ldots, 2^{n-1} - 1$. Let $P^*$ be the submatrix of $P$ obtained by removing the top row from $P$.

We now prove the theorem by contradiction. Suppose the $k$ vectors $\beta_1, \ldots, \beta_k$ are linearly independent. Hence the $k$ vectors in $V_{n-1}$, $\gamma_1, \ldots, \gamma_k$, are also linearly independent and hence $k \le n - 1$.

Applying Lemma 3 to matrix $P^*$, we conclude that each $k$-dimensional $(1,-1)$-vector appears in $P^*$, as a column vector of $P^*$, precisely $2^{n-1-k}$ times.

Thus for each fixed $j$ there exists a number $j_0$, $0 \le j_0 \le 2^{n-1} - 1$, such that $(p_{1j_0}, \ldots, p_{kj_0}) = -(p_{1j}, \ldots, p_{kj})$ and hence

$$2\Big(\Delta(0) - \sum_{i=1}^{k} p_{ij}\Delta(\beta_i)\Big) = a_{2j_0}^2 + a_{2j_0+1}^2 \qquad (4)$$

Adding (3) and (4) together, we have $4\Delta(0) = a_{2j}^2 + a_{2j+1}^2 + a_{2j_0}^2 + a_{2j_0+1}^2$. Hence $a_{2j}^2 + a_{2j+1}^2 + a_{2j_0}^2 + a_{2j_0+1}^2 = 2^{n+2}$. There are two cases to be considered: even $n$ and odd $n$.

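Case 1: $n$ is odd. By using Lemma 4,

$$\{a_{2j}^2, a_{2j+1}^2, a_{2j_0}^2, a_{2j_0+1}^2\} = \{2^{n+1}, 2^{n+1}, 0, 0\}, \quad j = 0, 1, \ldots, 2^{n-1} - 1 \qquad (5)$$

Hence from (3), we have $\Delta(0) + \sum_{i=1}^{k} p_{ij}\Delta(\beta_i) = 2^{n+1}, 2^n, 0$ and hence

$$\sum_{i=1}^{k} p_{ij}\Delta(\beta_i) = 2^n, 0, -2^n, \quad j = 0, 1, \ldots, 2^{n-1} - 1 \qquad (6)$$

For each fixed $j$, rewrite (6) as

$$p_{1j}\Delta(\beta_1) + \sum_{i=2}^{k} p_{ij}\Delta(\beta_i) = 2^n, 0, -2^n \qquad (7)$$

By using Lemma 3 there exists a number $j_1$, $0 \le j_1 \le 2^{n-1} - 1$, such that

$$(p_{1j_1}, p_{2j_1}, \ldots, p_{kj_1}) = (p_{1j}, -p_{2j}, \ldots, -p_{kj}).$$

Hence

$$p_{1j}\Delta(\beta_1) - \sum_{i=2}^{k} p_{ij}\Delta(\beta_i) = 2^n, 0, -2^n \qquad (8)$$

Adding (7) and (8) together, we have

$$p_{1j}\Delta(\beta_1) = \pm 2^n, \pm 2^{n-1}, 0$$

Since $\Delta(\beta_1) \ne 0$, we conclude $\Delta(\beta_1) = \pm 2^n, \pm 2^{n-1}$. By the same reasoning we can prove

$$\Delta(\beta_j) = \pm 2^n, \pm 2^{n-1}, \quad j = 1, 2, \ldots, k \qquad (9)$$

Thus we can write

$$(\Delta(\beta_1), \ldots, \Delta(\beta_k)) = 2^{n-1}(b_1, \ldots, b_k) \qquad (10)$$

where each $b_j = \pm 1, \pm 2$. By using Lemma 3 there exists a number $s$, $0 \le s \le 2^{n-1} - 1$, such that

$$(p_{1s}, \ldots, p_{ks}) = \Big(\frac{b_1}{|b_1|}, \ldots, \frac{b_k}{|b_k|}\Big) \qquad (11)$$

Due to (10) and (11),

$$\sum_{i=1}^{k} p_{is}\Delta(\beta_i) = \sum_{i=1}^{k} \frac{b_i}{|b_i|}\Delta(\beta_i) = 2^{n-1}\sum_{i=1}^{k} |b_i| \ge k \cdot 2^{n-1} \qquad (12)$$

Since $k \ge 3$, (12) contradicts (6).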

Case 2: $n$ is even. By using Lemma 4,

$$\{a_{2j}^2, a_{2j+1}^2, a_{2j_0}^2, a_{2j_0+1}^2\} = \{2^{n+2}, 0, 0, 0\} \text{ or } \{2^n, 2^n, 2^n, 2^n\}, \quad j = 0, 1, \ldots, 2^{n-1} - 1 \qquad (13)$$

Hence from (3), we have $\Delta(0) + \sum_{i=1}^{k} p_{ij}\Delta(\beta_i) = 2^{n+1}, 2^n, 0$, and hence

$$\sum_{i=1}^{k} p_{ij}\Delta(\beta_i) = 2^n, 0, -2^n$$

Repeating the same deduction as in Case 1, we obtain a contradiction in Case 2.

Summarizing Cases 1 and 2, we conclude that the assumption that $\beta_1, \ldots, \beta_k$ are linearly independent is wrong. This proves the theorem. □



Theorem 3 shows that $\Re$ is subject to crucial restrictions. We now compare Theorem 3 with Theorem 2. Since $n + 1$ non-zero vectors in $V_n$ must be linearly dependent, Theorem 2 is trivial when $\#\Re \ge n + 2$ (i.e., $\#(\Re - \{0\}) \ge n + 1$). In contrast, in Theorem 3 the linear dependence of vectors takes place in each $\Re \cap W$, not only in $\Re$.

We notice that there exist $2^n - 1$ $(n-1)$-dimensional linear subspaces. Hence Theorem 3 is more profound than Theorem 2.

5 The Unbiased Distribution of $\Re$

In this section we focus on the distribution of $\Re$ for the functions on $V_n$ whose nonlinearity does not take the special value $2^{n-1} - 2^{\frac{1}{2}(n-1)}$ or $2^{n-1} - 2^{\frac{1}{2}n}$ or $2^{n-1} - 2^{\frac{1}{2}n-1}$.

The next result is from Q (Theorem 18). 

Lemma 6. Let $f$ be a function on $V_n$ ($n \ge 2$), $\xi$ be the sequence of $f$, and $p$ be an integer, $2 \le p \le n$. If $\langle \xi, \ell_j \rangle \equiv 0 \pmod{2^{n-p+2}}$, where $\ell_j$ is the $j$th row of $H_n$, $j = 0, 1, \ldots, 2^n - 1$, then the algebraic degree of $f$ is at most $p - 1$.



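Lemma 7. For every function $f$ on $V_n$, we have

$$4(\Delta(\alpha_0), \Delta(\alpha_4), \ldots, \Delta(\alpha_{2^n-4}))H_{n-2} = \Big(\sum_{j=0}^{3} \langle \xi, \ell_j \rangle^2, \sum_{j=4}^{7} \langle \xi, \ell_j \rangle^2, \ldots, \sum_{j=2^n-4}^{2^n-1} \langle \xi, \ell_j \rangle^2\Big)$$

where $\xi$ denotes the sequence of $f$ and $\ell_i$ is the $i$th row of $H_n$, $i = 0, 1, \ldots, 2^n - 1$.

Proof. Comparing the $4j$th terms, $j = 0, 1, \ldots, 2^{n-2} - 1$, in the two sides of equality (1), we obtain

$$2^n(\Delta(\alpha_0), \Delta(\alpha_4), \ldots, \Delta(\alpha_{2^n-4})) = \Big(\sum_{j=0}^{3} \langle \xi, \ell_j \rangle^2, \sum_{j=4}^{7} \langle \xi, \ell_j \rangle^2, \ldots, \sum_{j=2^n-4}^{2^n-1} \langle \xi, \ell_j \rangle^2\Big)H_{n-2}$$

This proves the lemma. □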



Theorem 4. Let $f$ be a function on $V_n$, and $U$ be an $(n-2)$-dimensional linear subspace satisfying $\#(\Re \cap U) = 1$ (i.e., $\Re \cap U = \{0\}$). Then we have

(i) if $n$ is odd, then the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$ and the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$,

(ii) if $n$ is even, then $f$ is bent or the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}n}$ and the algebraic degree of $f$ is at most $\frac{1}{2}n + 1$.



Proof. We only need to prove the theorem in the special case when $U$ is composed of $\alpha_0, \alpha_4, \alpha_8, \ldots, \alpha_{2^n-4}$, where $\alpha_{4j} \in V_n$ is the binary representation of even number $4j$, $j = 0, 1, 2, \ldots, 2^{n-2} - 1$. In other words, $U$ is composed of all the vectors in $V_n$ that can be expressed in the form $(a_1, \ldots, a_{n-2}, 0, 0)$, where each $a_j \in GF(2)$. For $U$ in the general case, we can use a nonsingular linear transformation on the variables so as to change $U$ into the special case. Let $\xi$ be the sequence of $f$. Set $a_j^2 = \langle \xi, \ell_j \rangle^2$, $j = 0, 1, \ldots, 2^n - 1$.

Since $\Delta(0) = 2^n$ and $\Delta(\alpha_{4j}) = 0$, $j = 1, 2, \ldots, 2^{n-2} - 1$, the equality in Lemma 7 is specialized as

$$2^{n+2}(1, 1, \ldots, 1) = \Big(\sum_{j=0}^{3} a_j^2, \sum_{j=4}^{7} a_j^2, \ldots, \sum_{j=2^n-4}^{2^n-1} a_j^2\Big) \qquad (14)$$

that is, $a_{4j}^2 + a_{4j+1}^2 + a_{4j+2}^2 + a_{4j+3}^2 = 2^{n+2}$, $j = 0, 1, \ldots, 2^{n-2} - 1$.

(i) When $n$ is odd, by using Lemma 4,

$$\{a_{4j}^2, a_{4j+1}^2, a_{4j+2}^2, a_{4j+3}^2\} = \{2^{n+1}, 2^{n+1}, 0, 0\}, \quad j = 0, 1, \ldots, 2^{n-2} - 1$$

By using Lemma 1 we have proved the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$, and by using Lemma 6 we have proved that the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$.

(ii) When $n$ is even, by using Lemma 4,

$$\{a_{4j}^2, a_{4j+1}^2, a_{4j+2}^2, a_{4j+3}^2\} = \{2^n, 2^n, 2^n, 2^n\} \text{ or } \{2^{n+2}, 0, 0, 0\}, \quad j = 0, 1, \ldots, 2^{n-2} - 1.$$

If there exists a number $j_0$, $0 \le j_0 \le 2^{n-2} - 1$, such that

$$\{a_{4j_0}^2, a_{4j_0+1}^2, a_{4j_0+2}^2, a_{4j_0+3}^2\} = \{2^{n+2}, 0, 0, 0\}$$

then by using Lemma 1 we have proved that the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}n}$, and by using Lemma 6 we have proved that the algebraic degree of $f$ is at most $\frac{1}{2}n + 1$.

If there exists no such $j_0$, i.e., $\{a_{4j}^2, a_{4j+1}^2, a_{4j+2}^2, a_{4j+3}^2\} = \{2^n, 2^n, 2^n, 2^n\}$, $j = 0, 1, \ldots, 2^{n-2} - 1$, then $f$ is bent.

□

To emphasise the distribution of $\Re$ we modify Theorem 4 as follows:

Theorem 5. Let $f$ be a function on $V_n$. If the nonlinearity of $f$ does not take the special value $2^{n-1} - 2^{\frac{1}{2}(n-1)}$ or $2^{n-1} - 2^{\frac{1}{2}n}$ or $2^{n-1} - 2^{\frac{1}{2}n-1}$, then $\#(\Re \cap U) \ge 2$ where $U$ is any $(n-2)$-dimensional linear subspace. In other words, every $(n-2)$-dimensional linear subspace $U$ contains a non-zero vector in $\Re$.

There exist many methods to locate all the $(n-1)$-dimensional linear subspaces and all the $(n-2)$-dimensional linear subspaces in $V_n$. For example, let $\varphi_\alpha$ denote the linear function on $V_n$, where $\alpha \in V_n$ is non-zero, such that $\varphi_\alpha(x) = \langle \alpha, x \rangle$. Hence $W = \{\gamma \mid \gamma \in V_n, \varphi_\alpha(\gamma) = 0\}$ is an $(n-1)$-dimensional linear subspace and each $(n-1)$-dimensional linear subspace can be expressed in this form.

Also for any non-zero $\alpha, \alpha' \in V_n$ with $\alpha \ne \alpha'$, $U = \{\gamma \mid \gamma \in V_n, \varphi_\alpha(\gamma) = 0, \varphi_{\alpha'}(\gamma) = 0\}$ is an $(n-2)$-dimensional linear subspace and each $(n-2)$-dimensional linear subspace can be expressed in this form.
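The subspace description above translates directly into a brute-force search. The sketch below is illustrative only. It encodes vectors as integers (an assumption), builds $W$ and $U$ from the linear functions $\varphi_\alpha$, and counts $\#(\Re \cap W)$ for every $(n-1)$-dimensional subspace.

```python
def parity(v):
    """Scalar product over GF(2) reduces to the parity of the AND of two integers."""
    return bin(v).count("1") % 2


def non_propagative_set(truth_table):
    """The set of alpha with non-zero auto-correlation."""
    size = len(truth_table)
    return {a for a in range(size)
            if sum((-1) ** (truth_table[x] ^ truth_table[x ^ a])
                   for x in range(size)) != 0}


def hyperplane(n, alpha):
    """W = {gamma : <alpha, gamma> = 0} for a non-zero alpha."""
    return {g for g in range(2 ** n) if parity(alpha & g) == 0}


def codim2_subspace(n, alpha, alpha2):
    """U = {gamma : <alpha, gamma> = 0 and <alpha2, gamma> = 0}, alpha != alpha2, both non-zero."""
    return hyperplane(n, alpha) & hyperplane(n, alpha2)


def hyperplane_intersections(truth_table):
    """Map each non-zero alpha to #(R intersect W_alpha)."""
    n = len(truth_table).bit_length() - 1
    r = non_propagative_set(truth_table)
    return {alpha: len(r & hyperplane(n, alpha)) for alpha in range(1, 2 ** n)}
```

For $f = x_1x_2x_3$ on $V_3$, whose nonlinearity 1 is not one of the special values, every hyperplane meets $\Re$ in 4 vectors. For $f = x_1 \oplus x_2x_3$, whose nonlinearity 2 is a special value, some hyperplanes meet $\Re$ only in the zero vector.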

Lemma 8. Let $\Omega$ be a subset of $V_k$ with $0 \notin \Omega$. If there exists a positive integer $p$ such that $\#(\Omega \cap U) \ge p$ holds for every $(k-1)$-dimensional linear subspace $U$, then $\#\Omega \ge 2p + 1$.

Proof. Note that each non-zero vector is included in precisely $2^{k-1} - 1$ $(k-1)$-dimensional linear subspaces. On the other hand, there exist exactly $2^k - 1$ $(k-1)$-dimensional linear subspaces. Hence $(2^{k-1} - 1)\#\Omega = \sum_U \#(\Omega \cap U)$. From $\#(\Omega \cap U) \ge p$, we conclude that $(2^{k-1} - 1)\#\Omega \ge (2^k - 1)p$. Since $k \ge 2$, $\#\Omega > 2p$ or $\#\Omega \ge 2p + 1$. □
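The counting fact used in this proof can be checked for a small $k$. The following one-function sketch is an illustration that assumes integer-encoded vectors. It counts the $(k-1)$-dimensional subspaces $\{\gamma : \langle \alpha, \gamma \rangle = 0\}$ that contain a given vector.

```python
def hyperplanes_through(k, v):
    """Number of non-zero alpha in V_k with <alpha, v> = 0 over GF(2)."""
    return sum(1 for alpha in range(1, 2 ** k)
               if bin(alpha & v).count("1") % 2 == 0)
```

For $k = 4$ every non-zero vector lies in $2^{3} - 1 = 7$ of the $2^4 - 1 = 15$ subspaces, while the zero vector lies in all 15.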

Theorem 6. Let $f$ be a function on $V_n$. If the nonlinearity of $f$ does not take the special values $2^{n-1} - 2^{\frac{1}{2}(n-1)}$ or $2^{n-1} - 2^{\frac{1}{2}n}$ or $2^{n-1} - 2^{\frac{1}{2}n-1}$, then $\#(\Re \cap W) \ge 4$ for every $(n-1)$-dimensional linear subspace $W$. In other words, every $(n-1)$-dimensional linear subspace $W$ contains at least three non-zero vectors in $\Re$.

Proof. Let $W$ be an arbitrary $(n-1)$-dimensional linear subspace and $U$ be an arbitrary $(n-2)$-dimensional linear subspace with $U \subset W$. Note that the inequality in Theorem 5 can be rewritten as

$$\#((\Re - \{0\}) \cap U) \ge 1 \qquad (15)$$

and $((\Re - \{0\}) \cap W) \cap U = (\Re - \{0\}) \cap U$. Applying Lemma 8 we have proved $\#((\Re - \{0\}) \cap W) \ge 3$. Since $0 \in \Re \cap W$, $\#(\Re \cap W) \ge 4$. □

Theorems 5 and 6 are helpful to locate the non-propagative vectors.

The properties mentioned together in Theorems 5 and 6 are called the unbiased distribution of $\Re$, with respect to every $(n-2)$-dimensional linear subspace and every $(n-1)$-dimensional linear subspace.

6 Distribution of $\Re$ in Special Cases

We now turn to the case $\#(\Re_f \cap W) \le 3$ where $W$ is an $(n-1)$-dimensional
linear subspace. The following Lemma can be found in 

Lemma 9. Let $n \ge 2$ be a positive integer and $2^n = a^2 + b^2$ where $a \ge b \ge 0$ and both $a$ and $b$ are integers. Then $a^2 = 2^n$ and $b = 0$ when $n$ is even, and $a^2 = b^2 = 2^{n-1}$ when $n$ is odd.

Theorem 7. Let $f$ be a function on $V_n$, and $W$ be an $(n-1)$-dimensional linear subspace satisfying $\#(\Re \cap W) = 1$ (i.e., $\Re \cap W = \{0\}$). We have

(i) $f$ has at most one non-zero linear structure,

(ii) if $n$ is odd, then the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$ and the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$,

(iii) if $n$ is even, then $f$ is bent.

Proof. (i) Let $\alpha^* \in V_n$ and $\alpha^* \notin W$. From linear algebra, $V_n = W \cup (\alpha^* \oplus W)$, where $\alpha^* \oplus W = \{\alpha^* \oplus \alpha \mid \alpha \in W\}$, and $W$ and $\alpha^* \oplus W$ are disjoint. We now prove that $f$ has at most one non-zero linear structure by contradiction. Suppose $f$ has two non-zero linear structures, $\beta_1$ and $\beta_2$ with $\beta_1 \ne \beta_2$. Since all linear structures of $f$ form a linear subspace of $V_n$, $\beta_1 \oplus \beta_2$ is also a non-zero linear structure of $f$ and hence $\beta_1 \oplus \beta_2 \in \Re$. Since $\Re \cap W = \{0\}$, $\beta_1, \beta_2 \in \alpha^* \oplus W$. Obviously $\beta_1 \oplus \beta_2 \in W$ and hence $\beta_1 \oplus \beta_2 \in \Re \cap W$. This contradicts the condition $\Re \cap W = \{0\}$. The contradiction proves that $f$ has at most one non-zero linear structure.

Recall the proof of Theorem 3. Equation (3) can be specialized as $2\Delta(0) = a_{2j}^2 + a_{2j+1}^2$ and hence $a_{2j}^2 + a_{2j+1}^2 = 2^{n+1}$, where $j = 0, 1, \ldots, 2^{n-1} - 1$.

(ii) If $n$ is odd, from Lemma 9, $\{a_{2j}^2, a_{2j+1}^2\} = \{2^{n+1}, 0\}$, where $j = 0, 1, \ldots, 2^{n-1} - 1$. From Lemma 1 the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$. By using Lemma 6 we conclude that the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$.

(iii) If $n$ is even, due to Lemma 9, $a_{2j}^2 = a_{2j+1}^2 = 2^n$, where $j = 0, 1, \ldots, 2^{n-1} - 1$. This proves that $f$ is bent.

□

Example 1. Let $n$ be a positive odd number and $f(x_1, \ldots, x_n) = x_1 \oplus g(x_2, \ldots, x_n)$ where $g$ is a bent function on $V_{n-1}$. Let $W$ be an $(n-1)$-dimensional linear subspace of $V_n$, composed of all the vectors in $V_n$ that can be expressed in the form $(0, a_2, \ldots, a_n)$, where each $a_j \in GF(2)$. It is easy to see $\alpha^* = (1, 0, \ldots, 0) \in V_n$ is a non-zero linear structure of $f$ and $\Re \cap W = \{0\}$. Due to (ii) of Theorem 7, $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$.
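Example 1 can be reproduced by brute force for small odd $n$. The sketch below is an illustration under one assumption not fixed by the text: the bent function is taken to be $g = x_2x_3 \oplus x_4x_5 \oplus \cdots$, a standard choice, and $x_1$ is the most significant bit of the integer encoding.

```python
def example1_truth_table(n):
    """f = x1 xor x2*x3 xor x4*x5 xor ... for odd n (assumed choice of bent g)."""
    table = []
    for x in range(2 ** n):
        bits = [(x >> (n - 1 - j)) & 1 for j in range(n)]  # bits[0] is x1
        value = bits[0]
        for j in range(1, n - 1, 2):
            value ^= bits[j] & bits[j + 1]
        table.append(value)
    return table


def analyse(n):
    """Return (R, R intersect W, N_f) with W = {(0, a_2, ..., a_n)}."""
    tt = example1_truth_table(n)
    size = 2 ** n
    xi = [(-1) ** b for b in tt]
    r = [a for a in range(size)
         if sum(xi[x] * xi[x ^ a] for x in range(size)) != 0]
    r_in_w = [a for a in r if a < size // 2]  # first coordinate equal to 0
    walsh_max = max(abs(sum(xi[x] * (-1) ** bin(i & x).count("1")
                            for x in range(size)))
                    for i in range(size))
    return r, r_in_w, size // 2 - walsh_max // 2
```

For $n = 3$ this gives $\Re = \{0, (1,0,0)\}$, $\Re \cap W = \{0\}$ and $N_f = 2 = 2^{2} - 2^{1}$, in agreement with the example.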

We can restate (iii) of Theorem 7 as follows:

Proposition 1. Let $f$ be a function on $V_n$ where $n$ is even. If there exists an $(n-1)$-dimensional linear subspace $W_0$ satisfying $\#(\Re \cap W_0) = 1$ (i.e., $\Re \cap W_0 = \{0\}$), then $f$ satisfies $\Re \cap W = \{0\}$ for every $(n-1)$-dimensional linear subspace $W$.



Next we examine the case of $\#(\Re \cap W) = 2$.

Theorem 8. Let $f$ be a function on $V_n$. If there exists an $(n-1)$-dimensional linear subspace $W$ satisfying $\Re \cap W = \{0, \beta_1\}$, then we have

(i) $\beta_1$ is a non-zero linear structure of $f$,

(ii) if $n$ is odd, then the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$ and the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$,

(iii) if $n$ is even, then $N_f = 2^{n-1} - 2^{\frac{1}{2}n}$ and the algebraic degree of $f$ is at most $\frac{1}{2}n + 1$.



Proof. Since any single non-zero vector is linearly independent, we can keep the deduction in the proof of Theorem 3 until inequality (12), where we need the condition $k \ge 3$.

(i) Recall the proof of Theorem 3. Equation (6) can be specialized as $p_{1j}\Delta(\beta_1) = 2^n, 0, -2^n$, $j = 0, 1, \ldots, 2^{n-1} - 1$. Since $\beta_1 \in \Re$, $\Delta(\beta_1) \ne 0$. Hence $\Delta(\beta_1) = \pm 2^n$. This proves that $\beta_1$ is a non-zero linear structure.

(ii) If $n$ is odd, from (5) we conclude that $\langle \xi, \ell_i \rangle^2 = 2^{n+1}, 0$, $i = 0, 1, \ldots, 2^n - 1$, and hence by using Lemma 1 we have proved $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$. By using Lemma 6 we conclude that the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$.

(iii) If $n$ is even, from (13), $\langle \xi, \ell_i \rangle^2 = 2^{n+2}, 0, 2^n$. Since $\#\Re > 1$, $f$ is not bent. Hence $\langle \xi, \ell_i \rangle^2 = 2^n$ cannot hold for all $i$ and hence there exists a number $i_0$, $0 \le i_0 \le 2^n - 1$, such that $\langle \xi, \ell_{i_0} \rangle^2 = 2^{n+2}$. By using Lemma 1 we have proved $N_f = 2^{n-1} - 2^{\frac{1}{2}n}$. By using Lemma 6 we conclude that the algebraic degree of $f$ is at most $\frac{1}{2}n + 1$.

□

Example 2. Let $n$ be a positive odd number and $f(x_1, \ldots, x_n)$ be the same as that in Example 1. Let $W$ be an $(n-1)$-dimensional linear subspace of $V_n$, composed of all the vectors in $V_n$ that can be expressed in the form $(a_1, \ldots, a_{n-1}, 0)$, where each $a_j \in GF(2)$. It is easy to see $\alpha^* = (1, 0, \ldots, 0) \in V_n$ is a non-zero linear structure of $f$ and $\Re \cap W = \{0, \alpha^*\}$. Due to (ii) of Theorem 8, $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$.

Let $k$ be a positive even number with $k \ge 4$ and $h(x_1, \ldots, x_k) = x_1 \oplus x_2 \oplus q(x_3, \ldots, x_k)$ where $q$ is a bent function on $V_{k-2}$. Let $U$ be a $(k-1)$-dimensional linear subspace of $V_k$, composed of all the vectors in $V_k$ that can be expressed in the form $(0, a_2, \ldots, a_k)$, where each $a_j \in GF(2)$. It is easy to see $\alpha^* = (0, 1, 0, \ldots, 0)$ is a non-zero linear structure of $h$ and $\Re \cap U = \{0, \alpha^*\}$. Due to (iii) of Theorem 8, $N_h = 2^{k-1} - 2^{\frac{1}{2}k}$.

It is interesting that by using Theorem 8 we have determined $N_h$ only from the condition $\#(\Re \cap U) = 2$ for a $(k-1)$-dimensional linear subspace $U$, although we do not search other vectors in $\Re$.

Finally, we consider the case when $\#(\Re \cap W) = 3$.

Theorem 9. Let $f$ be a function on $V_n$. If there exists an $(n-1)$-dimensional linear subspace $W$ satisfying $\Re \cap W = \{0, \beta_1, \beta_2\}$, then the following statements hold:

(i) $\Delta(\beta_j) = \pm 2^{n-1}$, $j = 1, 2$,

(ii) if $n$ is odd, then the nonlinearity of $f$ satisfies $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$ and the algebraic degree of $f$ is at most $\frac{1}{2}(n+1)$,

(iii) if $n$ is even, then $N_f = 2^{n-1} - 2^{\frac{1}{2}n}$ and the algebraic degree of $f$ is at most $\frac{1}{2}n + 1$.



Proof. Since any two distinct non-zero vectors are linearly independent, we can keep the deduction in the proof of Theorem 3 until inequality (12), where we need the condition $k \ge 3$.

Recall the proof of Theorem 3. Equation (9) can be specialized as $\Delta(\beta_j) = \pm 2^n, \pm 2^{n-1}$, $j = 1, 2$.

On the other hand, (10), (11) and (12) can be rewritten as $(\Delta(\beta_1), \Delta(\beta_2)) = 2^{n-1}(b_1, b_2)$ where each $b_j = \pm 1, \pm 2$, $(p_{1s}, p_{2s}) = \big(\frac{b_1}{|b_1|}, \frac{b_2}{|b_2|}\big)$ and

$$p_{1s}\Delta(\beta_1) + p_{2s}\Delta(\beta_2) = (|b_1| + |b_2|)2^{n-1} \qquad (16)$$

respectively. It is easy to prove $b_1, b_2 = \pm 1$. Otherwise, for example, $b_1 = \pm 2$, and from (16), $p_{1s}\Delta(\beta_1) + p_{2s}\Delta(\beta_2) \ge 3 \cdot 2^{n-1}$. This contradicts (6). Since $b_1, b_2 = \pm 1$, $\Delta(\beta_1), \Delta(\beta_2) = \pm 2^{n-1}$. This proves (i).

The rest of the proof is the same as the proof of Theorem 8. □

Example 3. Let $n$ be a positive odd number with $n \ge 7$, $h(x_1, x_2, x_3, x_4, x_5) = (x_1 \oplus x_2 \oplus x_3)x_4x_5 \oplus x_1x_5 \oplus x_2x_4 \oplus x_1 \oplus x_2 \oplus x_3$ and $g(x_6, \ldots, x_n)$ be a bent function on $V_{n-5}$. Set $f(x_1, \ldots, x_n) = h(x_1, x_2, x_3, x_4, x_5) \oplus g(x_6, \ldots, x_n)$.

Let $W$ be an $(n-1)$-dimensional linear subspace of $V_n$, composed of all the vectors in $V_n$ that can be expressed in the form $(0, a_2, \ldots, a_n)$, where each $a_j \in GF(2)$. Write $\alpha_1^* = (0, 0, 1, 0, \ldots, 0)$, $\alpha_2^* = (0, 1, 0, \ldots, 0) \in V_n$. It is easy to verify $\alpha_1^*, \alpha_2^* \in \Re$ and $\Re \cap W = \{0, \alpha_1^*, \alpha_2^*\}$. Due to (i) and (ii) of Theorem 9, we conclude $\Delta(\alpha_1^*) = \pm 2^{n-1}$, $\Delta(\alpha_2^*) = \pm 2^{n-1}$ and $N_f = 2^{n-1} - 2^{\frac{1}{2}(n-1)}$.

We notice that by using Theorem 9 we have determined $N_f$, $\Delta(\alpha_1^*)$ and $\Delta(\alpha_2^*)$ only from the information about $\#(\Re \cap W)$ for an $(n-1)$-dimensional linear subspace $W$, although we do not search the other vectors in $\Re$.

We can also find an example corresponding to (iii) of Theorem 9. Theorems 7, 8 and 9 and Examples 1, 2 and 3 show that we can determine the nonlinearity of a function only from some information about $\#(\Re \cap W)$, where $W$ is an $(n-1)$-dimensional linear subspace. It is interesting that Q has proved that there exists no function with $\#\Re = 3$ while Example 3 gives a function satisfying $\#(\Re \cap W) = 3$ for an $(n-1)$-dimensional linear subspace $W$.

7 Conclusions 

The strong linear dependence is an improvement on a previously known result. 
The unbiased distribution of non-propagative vectors is valid for most functions.
These results provide more information on the non-propagative vectors in any
$(n-1)$-dimensional linear subspace of $V_n$, and hence they are helpful for designing
cryptographic functions.

Abstract. This paper studies the security offered by the block cipher
E2 against truncated differential cryptanalysis. At FSE’99 Matsui and 
Tokita showed a possible attack on an 8-round variant of E2 without IT- 
Function (the initial transformation) and ET-Function (the final trans- 
formation) based on byte characteristics. To evaluate the security against 
attacks using truncated differentials, which mean bytewise differentials 
in this paper, we searched for all truncated differentials that lead to pos- 
sible attacks for reduced-round variants of E2. As a result, we confirmed 
that there exist no such truncated differentials for E2 with more than 8 
rounds. However, we found another 7-round truncated differential which
leads to another possible attack on an 8-round variant of E2 without IT-
or ET-Function with less complexity. We also found that the 7-round 
truncated differential is useful to distinguish a 7-round variant of E2 
with IT- and ET-Functions from a random permutation. In spite of our 
severe examination, this type of cryptanalysis fails to break the full E2. 
We believe that this means that the full E2 offers strong security against 
this truncated differential cryptanalysis. 



1 Introduction 



The attacks using truncated differentials were introduced by Knudsen.
They deal with truncated differentials, i.e. differentials where only a part of the
difference can be predicted. Although the notion of truncated differentials he 
introduced is wide, with a byte-oriented cipher it is natural to study bytewise 
differentials as truncated differentials. The truncated differential can partly deal 
with the so called multiple-path for a Markov cipher which is a set of 

differential characteristics with the same input difference pattern and the same 
output difference pattern, hence the maximum probability of truncated differen- 
tial can be higher than that of differential characteristics. Moreover, the trun- 
cated differentials can allow the attackers more freedom in choosing plaintexts or 
ciphertexts. Therefore, studying the security against truncated differential crypt- 
analysis can provide a more strict evaluation of the security against differential 
cryptanalysis. 

A truncated differential cryptanalysis of reduced-round variants of E2 was presented by Matsui and Tokita at FSE’99. Their analysis was based on the “byte characteristic,” where the values of the difference in a byte are distinguished between non-zero and zero. They found a 7-round byte characteristic, which leads to a possible attack on an 8-round variant of E2 without IT-Function (the initial transformation) and ET-Function (the final transformation).

This paper studies the security of E2 against this type of cryptanalysis. We show an algorithm which searches for all effective truncated differentials that lead to possible attacks of Feistel ciphers, which Matsui et al. did not describe in detail. Here “effective” means that the probability of the truncated differential for the cipher is higher than the probability of the truncated differential for a random permutation. To run the algorithm above, we have to compute all non-zero probabilities of truncated differentials of the round function. Since the round function of E2 has the SPN (Substitution Permutation Network) structure, we made use of the method for computing the maximum average of differential probability of general SPN structures shown by Sugita et al.
As a result, we found another 7-round truncated differential, which leads to a possible attack on an 8-round variant of E2 without IT- or ET-Function with less complexity than that offered by Matsui et al. Moreover, this truncated differential was also useful in distinguishing a 7-round variant of E2 with IT- and ET-Functions from a random function. However, no flaw by the cryptanalysis above was discovered for the full 12-round E2, i.e. E2 in the specification submitted to NIST as an AES candidate.

The contents of this paper are as follows. First, in Section 2 we describe an algorithm to compute the probabilities for all truncated differentials of the round function with the SPN structure. Second, we show a search algorithm for the truncated differentials of E2 in Section 3. This algorithm is applicable to other ciphers with the Feistel structure. Section 4 describes possible scenarios of attacks on reduced-round variants of E2 using the truncated differentials found in Section 3 and estimates the required complexity for attacking.

2 Truncated Differentials of Round Function

First, we show examples of the transition rules between the input and output bytewise differences of the round function of E2 and define the truncated differential used in this paper. Throughout this paper we follow the notations used in the specification of E2 (see also Figure 1). The linear transformation in the round function (P-Function) is represented as follows.



$${}^t(z'_1, z'_2, \ldots, z'_8) = P \, {}^t(z_1, z_2, \ldots, z_8) \qquad (1)$$








Shiho Moriai et al. 



Fig. 1. The round function of E2

Let $x = (x_1, x_2, \ldots, x_8)$, $y = (y_2, \ldots, y_8, y_1)$, and $z = (z_1, z_2, \ldots, z_8)$ be the input of the round function, the output of the round function, and the input of P-Function, respectively, and let $\Delta x \in (GF(2)^8)^8$, $\Delta y \in (GF(2)^8)^8$, and $\Delta z \in (GF(2)^8)^8$ be the differences of $x$, $y$, and $z$, respectively.

$\Delta x = (\Delta x_1, \Delta x_2, \ldots, \Delta x_8)$, $\Delta x_i \in GF(2)^8$
$\Delta y = (\Delta y_2, \ldots, \Delta y_8, \Delta y_1)$, $\Delta y_i \in GF(2)^8$
$\Delta z = (\Delta z_1, \Delta z_2, \ldots, \Delta z_8)$, $\Delta z_i \in GF(2)^8$ $(i = 1, 2, \ldots, 8)$



For example, when two bytes of the input X\ and x^ are changed, if Az\ = Az^, 
then three bytes of the output ?/2, ye, and yi are changed. Otherwise (i.e., if 
Azi yf Azb) all bytes except j/7 are changed. Assuming that the input values 
xi, X2, ■ ■ ■ , xs and the input differences Z\a;i and Ax 5 are given randomly (while 
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the other AxiS are fixed to 0 {i ^ 1)5)), the former event (Azi = Azs) occurs 
with approximate probability (though the exact value is 2 ^)) Ih® latter 
event (Azi ^ Az^) occurs with approximate probability 1 — 2“®. We describe 
the transition rules above between the input and output bytewise differences as 
follows. 



(10001000)^(10001001) p«2-® 

( 10001000 )^( 11111011 ) ppzl-2~^ 

The transition rules above are generalized to the transition rules between the
input and output t-bitwise differences for a function with $m \times t$ bits of input
and output. We call these transition rules the truncated differentials of a function
$f : (GF(2)^t)^m \to (GF(2)^t)^m$. Formally, we define them as follows.



Definition 1 ($\chi$-Function). Let $\chi$ be the function $GF(2)^t \to GF(2)$ defined as
follows.

    $\chi(x) = 0$ if $x = 0$
    $\chi(x) = 1$ if $x \neq 0$

Let $\chi(x_1, x_2, \ldots, x_m) = (\chi(x_1), \chi(x_2), \ldots, \chi(x_m))$.



Definition 2 (Truncated Differential). Let $\delta x, \delta y \in GF(2)^m$ denote the
input differential and output differential of the truncated differential of the
function f, respectively.

    $\delta x = (\delta x_1, \delta x_2, \ldots, \delta x_m)$,  $\delta x_i \in GF(2)$  $(i = 1, 2, \ldots, m)$
    $\delta y = (\delta y_1, \delta y_2, \ldots, \delta y_m)$,  $\delta y_i \in GF(2)$

where $\delta x_i = \chi(\Delta x_i)$ and $\delta y_i = \chi(\Delta y_i)$.

Let $p_f(\delta x, \delta y)$ denote the probability of the truncated differential of the
function f. $p_f(\delta x, \delta y)$ is defined by

    $p_f(\delta x, \delta y) = \max_{\Delta x \neq 0,\ \chi(\Delta x) = \delta x} \Pr_{x \in (GF(2)^t)^m} [\, \chi(f(x) \oplus f(x \oplus \Delta x)) = \delta y \,]$    (2)

We define the pair of $\delta x$ and $\delta y$ as the truncated differential of the function f
and represent it as follows:

    $\delta x \to \delta y$ with probability $p_f(\delta x, \delta y)$.
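Definitions 1 and 2 can be illustrated on a very small example. The following Python sketch is only an illustration under stated assumptions: `toy_f` is a hypothetical two-word linear map (it is not the round function of E2), and `truncated_probability` evaluates Equation (2) by brute force, which is feasible only for tiny m and t.

```python
from itertools import product

def chi(delta):
    """Map a tuple of t-bit word differences to its 0/1 activity pattern."""
    return tuple(1 if word != 0 else 0 for word in delta)

def truncated_probability(f, m, t, dx, dy):
    """Brute-force p_f(dx, dy) of Definition 2 for a tiny function f.

    f maps a tuple of m t-bit words to a tuple of m t-bit words.
    """
    inputs = list(product(range(1 << t), repeat=m))
    best = 0.0
    for delta in inputs:
        # Only nonzero differences whose pattern equals dx are admissible.
        if not any(delta) or chi(delta) != tuple(dx):
            continue
        hits = 0
        for x in inputs:
            x2 = tuple(a ^ d for a, d in zip(x, delta))
            out_diff = tuple(a ^ b for a, b in zip(f(x), f(x2)))
            hits += chi(out_diff) == tuple(dy)
        # Equation (2) takes the maximum over all admissible differences.
        best = max(best, hits / len(inputs))
    return best

def toy_f(x):
    """Hypothetical linear toy map on two words: (x1, x2) -> (x1 ^ x2, x2)."""
    return (x[0] ^ x[1], x[1])
```

For this toy map with m = 2 and t = 2, the input pattern (1, 1) can lead to the output pattern (0, 1) when both word differences are equal, which mirrors the cancelation discussed for $\Delta z_1 = \Delta z_5$ above.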



To search exhaustively for the effective truncated differentials of the whole
cipher, we need to derive all possible truncated differentials of the round function
with non-zero probability. Sugita showed a method for calculating the
maximum average of differential probability of the SPN structure, assuming that
the differential probability of the s-boxes is uniformly distributed for any nonzero
input difference and any nonzero output difference.* Following Section 5 of the
cited work, we can calculate efficiently the probabilities of the truncated differentials
of the round function of E2, $p_F(\delta x, \delta y)$, for every $\delta x, \delta y \in GF(2)^8$. We begin by
introducing the following semi-order $\preceq$ in $GF(2)^m$.

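Definition 3 (Semi-order). For every $\delta x, \delta y \in GF(2)^m$, we define the
semi-order $\preceq$ in $GF(2)^m$ as follows.

    $\delta x \preceq \delta y \iff \forall i\ (1 \le i \le m):\ \delta y_i = 0 \Rightarrow \delta x_i = 0$,

where $\delta x = (\delta x_1, \delta x_2, \ldots, \delta x_m)$ and $\delta y = (\delta y_1, \delta y_2, \ldots, \delta y_m)$.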

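Algorithm 1 (Calculation of the Probabilities of Truncated Differentials
of the Round Function of E2).

1. For every $\delta z, \delta z' \in GF(2)^8$, we define $M(\delta z, \delta z')$ for P-Function as follows.

    $M(\delta z, \delta z') = \#\{ (\Delta z, \Delta z') \in ((GF(2)^8)^8 \setminus \{0\})^2 \mid \Delta z' = P \Delta z,\ \chi(\Delta z) \preceq \delta z,\ \chi(\Delta z') \preceq \delta z' \}$,

   where $\delta z = (\delta z_1, \delta z_2, \ldots, \delta z_8)$, $\delta z_i = \chi(\Delta z_i)$,
         $\delta z' = (\delta z'_1, \delta z'_2, \ldots, \delta z'_8)$, $\delta z'_i = \chi(\Delta z'_i)$.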

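   $M(\delta z, \delta z')$ can be easily calculated by a simple rank calculation as follows.

    $M(\delta z, \delta z') = 2^{\, 8 \left( 16 - \mathrm{rank} \binom{P \;\; E}{T(\overline{\delta z}, \overline{\delta z'})} \right)} - 1$

   where $\overline{\delta z}$ and $\overline{\delta z'}$ are the complements of $\delta z$ and $\delta z'$, respectively. P denotes
   the matrix represented by Equation (1), E denotes the 8 x 8 identity matrix,
   and $T(\overline{\delta z}, \overline{\delta z'})$ denotes the 16 x 16 diagonal matrix whose (i, i) component
   equals $\overline{\delta z}_i$ for $i = 1, \ldots, 8$, and $\overline{\delta z'}_{i-8}$ for $i = 9, \ldots, 16$.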

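2. For every $\delta z, \delta z' \in GF(2)^8$, we define $N(\delta z, \delta z')$ for P-Function as follows.

    $N(\delta z, \delta z') = \#\{ (\Delta z, \Delta z') \in ((GF(2)^8)^8 \setminus \{0\})^2 \mid \Delta z' = P \Delta z,\ \chi(\Delta z) = \delta z,\ \chi(\Delta z') = \delta z' \}$

   $N(\delta z, \delta z')$ can be calculated recursively, using the following relation

    $N(\delta z, \delta z') = M(\delta z, \delta z') - \sum_{(\widetilde{\delta z}, \widetilde{\delta z'}) \prec (\delta z, \delta z')} N(\widetilde{\delta z}, \widetilde{\delta z'})$    (3)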

* Strictly speaking, let $n_s(\Delta x, \Delta y) = \Pr_x\{ s(x) \oplus s(x \oplus \Delta x) = \Delta y \}$ and the following
is assumed:

    $n_s = 1/(2^8 - 1)$  if $\Delta x \neq 0$ and $\Delta y \neq 0$
    $n_s = 1$            if $\Delta x = \Delta y = 0$
    $n_s = 0$            otherwise.
3. For every $\delta x, \delta y \in GF(2)^8$, calculate the probability of the truncated differential
   of the round function $p_F(\delta x, \delta y)$ according to the following relations

    $p_F(\delta x, \delta y) = \sum_{\delta z} N(\delta z, \delta z') \, q^{w_H(\delta z)} \, p_S(\delta x, \delta z)$    (4)

    $p_S(\delta x, \delta z) = \max_{\Delta x \neq 0,\ \chi(\Delta x) = \delta x} \Pr_{x \in (GF(2)^8)^8} [\, \chi(S(x) \oplus S(x \oplus \Delta x)) = \delta z \,]$    (5)

   where $w_H(\cdot)$ denotes the Hamming weight and q is the maximum average
   of the differential probability of the s-box. We have $q = 1/(2^8 - 1)$ under
   the assumption that the differential probability of the s-boxes is uniformly
   distributed for any nonzero input difference and any nonzero output difference.



3 The Search for Truncated Differentials of E2 

In this section we search for all truncated differentials that lead to possible
attacks on (reduced-round variants of) E2. Below we show a search algorithm
for all “effective” truncated differentials of a Feistel cipher with R rounds and
blocksize 2mt bits. In this paper, “effective” means that the truncated differential
could lead to possible attacks; in other words, the probability of the truncated
differential is equal to or higher than the probability with which the truncated
differential holds for a random permutation.**

This search algorithm consists of recursive procedures. Note that the search
algorithm is a depth-first search rather than a breadth-first search, in consideration
of the required memory. The “depth” corresponds to the number of rounds of the
Feistel cipher.

Algorithm 2 (Search for all Effective Truncated Differentials of Feistel
Cipher with R Rounds and Blocksize 2mt Bits)

Let $X^{(r)}, Y^{(r)} \in GF(2)^m$ be the input and output differences of the truncated
differential of the r-th round function. Thus $(X^{(0)}, X^{(1)})$ is the truncated
differential of the plaintext. Let $\mathcal{P}_r$ be the variable which holds the probability of
the r-round truncated differential. $\mathcal{P}_0$ should be initialized to 1, i.e., $\mathcal{P}_0 := 1$.

1. Calculate all the probabilities of the truncated differentials of the round
   function $p_F(\delta x, \delta y)$. They should be sorted in order of the probability of
   truncated differentials for each input difference.

2. For all truncated differentials of the plaintext ($X^{(0)} \in GF(2)^m$ and
   $X^{(1)} \in GF(2)^m$), call the procedure [The 1st round], i.e., the procedure
   [The r-th round] for r = 1. After finishing the procedure [The 1st round]
   for all $X^{(0)}$ and $X^{(1)}$, exit the program.



** Although there is a claimed attack on the first 16 rounds of Skipjack using a
truncated differential with smaller probability than a random permutation,
we are not concerned here with this case.
3. [The r-th round] For each $X^{(r)}$, set the output truncated differential of the
   round function $Y^{(r)} \in GF(2)^m$ in order of the probability of the truncated
   differential.

   - Let $p_r := p_F(X^{(r)}, Y^{(r)})$.
   - If $\mathcal{P}_{r-1} \times p_r$ falls below the cutoff probability, then try another $Y^{(r)}$.
   - Call the procedure [The r-th XOR].

   If $r \neq 1$, return to the procedure [The (r - 1)-th XOR],
   otherwise (i.e., r = 1), return to Step 2.

4. [The r-th XOR] At the XOR operation of the r-th round in the Feistel
   cipher, $X^{(r+1)}$ is derived from $X^{(r-1)}$ and $Y^{(r)}$. Here the difference may
   be canceled out: $1 \oplus 1 = 0$ with probability 1/255 ($\approx 2^{-t}$), while $1 \oplus 1 = 1$
   with probability 254/255, assuming that the difference is independent and
   uniformly distributed. When the cancelation occurs c times, the probability
   is approximately $(2^{-t})^c$. The number of all possible values of $X^{(r+1)}$ is
   $2^{w_H(X^{(r-1)} \wedge Y^{(r)})}$. For each $X^{(r+1)}$, call the following procedure.

   - Let $\mathcal{P}_r := \mathcal{P}_{r-1} \times p_r \times (2^{-t})^c$, where $c = w_H(X^{(r-1)} \vee Y^{(r)}) - w_H(X^{(r+1)})$.
   - If $\mathcal{P}_r$ falls below the cutoff probability, then try another $X^{(r+1)}$.
   - If $\mathcal{P}_r$ is lower than the probability for a random function,
     i.e., if $\mathcal{P}_r < 2^{t (w_H(X^{(r)}) + w_H(X^{(r+1)})) - 2mt}$, then try another $X^{(r+1)}$.
   - If $r < R$, call the procedure [The (r + 1)-th round],
     else print the truncated differential:

         $(X^{(0)}, X^{(1)}) \to (X^{(r)}, X^{(r+1)})$ with probability $\mathcal{P}_r$.

   Return to the procedure [The r-th round].

At the procedure [The r-XH Xor] of each round in the algorithm above, if 
the probability of the r-round truncated differential is lower than the probability 
with which the truncated differential holds for a random permutation, we don’t 
have to continue the search for the truncated differential for longer rounds. This 
makes the search efficient by pruning off unnecessary candidates. This is because 
the following theorem holds for Feistel ciphers. 

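Theorem 1. If

    $\mathcal{P}_r < 2^{t (w_H(X^{(r)}) + w_H(X^{(r+1)})) - 2mt}$    (6)

holds, then

    $\mathcal{P}_{r+1} < 2^{t (w_H(X^{(r+1)}) + w_H(X^{(r+2)})) - 2mt}$    (7)

holds.

Proof) We have

    $\mathcal{P}_{r+1} = \mathcal{P}_r \times p_{r+1} \times (2^{-t})^c$.

From Equation (6) and since $c = w_H(X^{(r)} \vee Y^{(r+1)}) - w_H(X^{(r+2)})$ holds, where
c is the number of times the cancelation happens in the procedure [The
(r + 1)-th XOR], we have

    $\mathcal{P}_{r+1} < 2^{t (w_H(X^{(r)}) + w_H(X^{(r+1)})) - 2mt} \times p_{r+1} \times 2^{-t (w_H(X^{(r)} \vee Y^{(r+1)}) - w_H(X^{(r+2)}))}$.

We have $p_{r+1} \le 1$ and $2^{t (w_H(X^{(r)}) - w_H(X^{(r)} \vee Y^{(r+1)}))} \le 1$, since

    $w_H(X^{(r)}) - w_H(X^{(r)} \vee Y^{(r+1)}) \le 0$

holds. Therefore,

    $\mathcal{P}_{r+1} < 2^{t (w_H(X^{(r+1)}) + w_H(X^{(r+2)})) - 2mt}$

holds. The proof is complete.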
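The pruning test used by the search can be written down compactly. The following Python sketch is illustrative only: the function names are hypothetical, probabilities are handled as base-2 logarithms, and the formulas assume the reading of the XOR step given above (c counts the bytes where two active differences cancel, and the random-permutation bound costs $2^{-t}$ for every inactive word of the output pattern).

```python
def hamming_weight(pattern):
    """Number of active (nonzero) words in a 0/1 truncated pattern."""
    return sum(pattern)

def random_bound_log2(x_r, x_r1, t):
    """log2 of the probability that a random permutation produces the
    output pattern (x_r, x_r1): t*(w_H(x_r) + w_H(x_r1)) - 2*m*t."""
    m = len(x_r)
    return t * (hamming_weight(x_r) + hamming_weight(x_r1)) - 2 * m * t

def cancellations(x_prev, y_r, x_next):
    """c = w_H(x_prev OR y_r) - w_H(x_next) for one XOR step.

    Raises ValueError if x_next cannot result from x_prev XOR y_r.
    """
    union = [a | b for a, b in zip(x_prev, y_r)]
    for a, b, u, n in zip(x_prev, y_r, union, x_next):
        # A word may become inactive only if both inputs were active.
        if n > u or (u == 1 and n == 0 and not (a and b)):
            raise ValueError("pattern not reachable at the XOR")
    return sum(union) - sum(x_next)

def xor_step_log2(p_prev_log2, p_round_log2, c, t):
    """log2 of P_r = P_{r-1} * p_r * (2^-t)^c."""
    return p_prev_log2 + p_round_log2 - t * c

def is_pruned(p_r_log2, x_r, x_r1, t):
    """True if the r-round differential is below the random bound."""
    return p_r_log2 < random_bound_log2(x_r, x_r1, t)
```

With m = 8 and t = 8, the output pattern (10001000 00000000) gives a random bound of $2^{-112}$, the figure quoted for E2 in Section 4.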



4 Attacks on Reduced-Round Variants of E2 



Using Algorithm 2 we searched for all truncated differentials that lead to possible
attacks on reduced-round variants of E2. As a result, we confirmed that there
exist no such truncated differentials for E2 with more than 7 rounds. The best***
7-round truncated differential that leads to possible attacks on reduced-round
variants of E2 is shown in Figure 2. For a random round function the probability
of the truncated differential is expected to be $(2^{-8})^{14} = 2^{-112}$, which is
significantly smaller than the probability with which this 7-round truncated
differential holds. Therefore, this truncated differential is useful to derive subkey
information of the last round and to distinguish from a random permutation.

Moreover, this 7-round truncated differential is connected with the truncated
differentials of IT- and FT-Functions with probability about 1. In IT- and FT-
Functions, 32-bit multiplications with subkeys are used. Since this multiplication
is modulo $2^{32}$ (roughly speaking, the upper 32 bits of the resulting 64-bit
product are discarded), this multiplication has the following trivial truncated
differential.

    (1000) $\to$ (1000)    $p \approx 1$

*** Here the “best” means that the ratio of the probability of the truncated differential
to the probability for a random permutation is the highest.

^ The more strict probability we computed was 248q'^^-|-18(ir^®-|-39(/^® -|-157q'^^-|-22(jt^® -|- 
62gi® -b225g^° -b 158g^^ -b 172g^2 -b -b205g^'‘ -b202g^® -b 189g^® -b 194172 ’’ +246<7^® -b 

1371729 + I 0 g®° -b 93(7®’ + 37g®2 + 173^33 g^34 ^g2g3s 74^36 ^ 156 ^^ 

28g®® -b ■ • ■ , while the probability of the best known truncated differential 
was computed as -b231g’® -b 168g’® -b 135g’’’ -b 115g’® -b 157g’® -b 163g2° +217g2i + 
90g22 + 208g23 + 59^24 ^ g4^25 gg^26 45g^27 430^29 250g3o + 227g®’ -b 

102g®2 + 118^33 40^34 246g®® -b 13g®® -b 146g®'^ -b 98g®® -b 153g®® + 8 g^° -b • • •, 

where q is the maximum average of the differential probability of the s-box. Under our
assumption, $q = 1/(2^8 - 1)$ for any nonzero input difference and any nonzero output
difference of the s-box.

Fig. 2. The best 7-round truncated differential of E2



Hence the 7-round truncated differential shown in Figure 2 can skip IT- and FT-
Functions with probability about 1. Additionally, the positions of the bytes which
have a non-zero difference are not changed by BP-Function (or $BP^{-1}$-Function)
in IT-Function (or FT-Function). It follows that we have the following truncated
differential connecting the plaintext and ciphertext for a 7-round variant of E2,
with about the same probability as the 7-round truncated differential of Figure 2.

    (10001000 00000000) $\to$ (10001000 00000000)

4.1 E2 Reduced to 8 Rounds without IT- or FT-Function

We show a possible scenario of an attack on E2 reduced to 8 rounds without
IT- or FT-Function. Below we show an attack to derive the last round key (the
round key of the 8-th round) without FT-Function.

Prepare the chosen plaintexts with the difference pattern (10001000 00000000),
guess the last round keys, and get the output of the 7-th round using the
ciphertexts. According to Section 5, Lemma 1 of Matsui et al., when we have
$2^{109}$ chosen plaintext pairs, if the number of pairs whose output differences
follow the difference pattern (10001000 00000000) is more than 20, we can judge
the guessed key is right, and otherwise we can judge the guessed key is wrong.
For a correct key the probability that the number of pairs whose output differences
follow the difference pattern (10001000 00000000) is more than 20 is 99%, while
for a wrong key this probability is negligible.

The required $2^{109}$ chosen plaintext pairs can be generated from $2^{94}$ chosen
plaintext blocks (94 = 109 - 16 + 1). The attack on an 8-round variant of E2
without IT- and FT-Functions shown by Matsui et al. required more chosen
plaintext blocks. Moreover, we do not have to choose special plaintexts (cf.
Section 5.2 of Matsui et al.), since the probability that correct pairs are detected
is much larger than the probability that wrong pairs appear.

Note that the complexity of the procedure above for deriving the last round
keys (128 bits) exceeds the complexity of exhaustive search, $O(2^{128})$. We have not
confirmed whether an improved attack with complexity less than $O(2^{128})$ is
possible.



4.2 E2 Reduced to 7 Rounds with IT- and FT-Function

The 7-round truncated differential shown in Figure 2 is also useful to distinguish
the 7-round variant of E2 with IT- and FT-Functions from a random
permutation.

Prepare the chosen plaintexts with the difference pattern (10001000 00000000)
and observe the differences of the ciphertexts. According to Matsui et al.’s theory,
when we have $2^{106}$ chosen plaintext pairs, if at least one pair follows the
difference pattern (10001000 00000000), we can regard it as the 7-round variant
of E2 with IT- and FT-Functions; otherwise we regard it as a random permutation.
The probability that at least one pair has an output difference following
the difference pattern (10001000 00000000) is 98%, while the
probability is 2% for a random permutation. The required $2^{106}$ chosen plaintext
pairs can be generated from $2^{91}$ plaintext blocks (91 = 106 - 16 + 1).
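The data-complexity arithmetic of Sections 4.1 and 4.2 can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the relation 94 = 109 - 16 + 1 is read as "a structure of $2^{16}$ plaintexts that differ only in the two active bytes yields about $2^{31}$ pairs", and the number of matching pairs is approximated by a Poisson distribution.

```python
import math

def log2_blocks_needed(log2_pairs, active_bits=16):
    """log2 of the chosen plaintext blocks needed for 2^log2_pairs pairs.

    A structure of 2^active_bits plaintexts yields about
    2^(2*active_bits - 1) pairs, i.e. 2^(active_bits - 1) pairs per block.
    """
    return log2_pairs - active_bits + 1

def prob_at_least_one(log2_pairs, log2_p):
    """Poisson estimate of Pr[at least one pair follows the pattern]."""
    expected = 2.0 ** (log2_pairs + log2_p)
    return 1.0 - math.exp(-expected)

# Section 4.1: 2^109 pairs from 2^94 blocks.
blocks_8_round = log2_blocks_needed(109)
# Section 4.2: 2^106 pairs from 2^91 blocks.
blocks_7_round = log2_blocks_needed(106)
# False alarm for a random permutation, whose pattern probability is 2^-112.
false_alarm = prob_at_least_one(106, -112)
```

Under these assumptions the false-alarm estimate for a random permutation comes out near 1.6%, in line with the roughly 2% quoted above.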

5 Conclusion 

This paper introduced search algorithms for finding effective truncated differentials
useful in truncated differential cryptanalysis. Applying them to E2, we found an
attack on an 8-round variant of E2 without IT- or FT-Function requiring $2^{94}$
chosen plaintexts, which is fewer than that required by the best known attack.
We also found that it is possible to distinguish a 7-round variant of E2 with IT-
and FT-Functions from a random permutation using $2^{91}$ chosen plaintexts.

Despite our thorough examination, this type of cryptanalysis fails to break
the full E2. We believe that this means that the full E2 offers strong security
against truncated differential cryptanalysis.
Table 1. Attacks on reduced-round variants of E2



Attack 1: Extract the last round key information 



Matsui et al.’s result 


8-round E2 without IT and FT 
2^^^ chosen plaintexts 


Our result 


8-round E2 without IT or FT
$2^{94}$ chosen plaintexts



Note) The attack complexities may be above $O(2^{128})$.



Attack 2: Distinguish from a random permutation 



Matsui et al.’s result 


7-round E2 without IT and FT 
2®^ chosen plaintexts 


Our result 


7-round E2 with IT and FT
$2^{91}$ chosen plaintexts



Appendix: Truncated Differentials of Rijndael 



We also searched for truncated differentials of Rijndael under the similar
condition that the differential probability of the S-boxes is uniformly distributed
for any nonzero input difference and any nonzero output difference. In the
Rijndael proposal, the designers stated that, for 6 rounds or more, no attacks
faster than exhaustive key search have been found. For differential characteristics
of Rijndael, they stated that it can be proven that there are no 4-round
differential trails with a predicted prop ratio (probability) above $2^{-150}$.

As a result of our search, there existed no truncated differentials for Rijndael
with more than 4 rounds that have higher probability than a randomly
chosen permutation. There existed 218,700,000 4-round truncated differentials
that has the same probability as a randomly chosen permutation. For differen- 
tials of Rijndael, we found a 5-round differential with probability 1.06 x 2“^^®, 
and there existed no differentials for Rijndael with more than 5 rounds that have 
higher probability than 
Abstract. DEAL is a six- or eight-round Luby-Rackoff cipher that uses
DES as its round function, with allowed key lengths of 128, 192, and 256 
bits. In this paper, we discuss two new results on the DEAL key schedule. 
First, we discuss the existence of equivalent keys for all three key lengths; 
pairs of equivalent keys in DEAL-128 require about $2^{64}$ DES encryptions
to find, while equivalent keys in DEAL-192 and DEAL-256 require only 
six or eight DES encryptions to find. Second, we discuss a new related-key
attack on DEAL-192 and DEAL-256. This attack requires $2^{33}$ related
key queries, the same 3 plaintexts encrypted under each key, and may
be implemented with a variety of time-memory tradeoffs: given $3 \times 2^{69}$
bytes of memory, the attack requires $2^{113}$ DES encryptions, and given
$3 \times 2^{45}$ bytes of memory, the attack requires $2^{137}$ DES encryptions. We
conclude with some questions raised by the analysis. 



1 Introduction 



In June 1998 the National Institute of Standards and Technology (NIST) re- 
ceived fifteen candidate algorithms for the Advanced Encryption Standard 
(AES). The AES would eventually replace DES as a federal encryption standard, 
and hopefully would become a world-wide encryption standard as well. 

One of the hardest aspects of cipher design is the key schedule. Numerous
AES submissions have been attacked through their key schedule, including
SAFER+, Crypton, DFC, Magenta, MARS, and RC6. These attacks have ranged
from finding equivalent keys to weak key classes to related-key differential
attacks, and have generally not been serious. Still, equivalent or related keys
can make the cipher unusable as a hash function (for example, in Davies-Meyer
feed forward mode) and can reduce the effective keyspace of the cipher;
related-key differential attacks can cause vulnerabilities in applications where
related-key queries are legitimate. Weak key classes can mean that a percentage of
the keys are vulnerable to attack.

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 118 ff., 2000.
© Springer-Verlag Berlin Heidelberg 2000

One of the submissions was DEAL (Data Encryption Algorithm with Larger
blocks). Intended as the conservative choice, DEAL was designed to
leverage the cryptographic confidence in DES while creating a new cipher with a
128-bit block and key lengths of 128-, 192-, and 256-bits. In this paper, we refer 
to DEAL with an n-bit key as DEAL-n. Thus, we have DEAL-128, DEAL-192, 
and DEAL-256. 

In earlier work by Lucks, an attack was presented for DEAL-192, with a number of possible
tradeoffs given between number of chosen-plaintext queries, and amount of work 
done for the attack. The best attack in terms of computational resources requires 
2®® chosen plaintexts, about 2^^® encryptions’ worth of work, and about 2®^ 
memory locations. With 2'*° bits (2^^ bytes) of memory. Lucks’ best attack on 
DEAL-192 requires 2^^ chosen plaintexts, and work equivalent to about 6 x 2^®® 
DES encryptions (about 2^®® DEAL encryptions). 

In a number of impractical attacks are discussed on DEAL-192. 

There is a straightforward meet-in-the-middle attack on DEAL-192 requiring 
about 2^®® work and 2^^® bytes of memory, requiring only three known plain- 
texts. The memory requirements are totally unreasonable, and trading off time 
for memory does not yield an attack with reasonable memory requirements and 
less work than brute-forcing the key. There is also a general attack on 6-round 
Feistel ciphers with bijective F-functions, based on a 5-round impossible trun- 
cated differential. Applying this attack to DEAL-192 gives an attack with 2^^® 
work, 2^® chosen plaintexts, and 2®® bytes of memory. The chosen-plaintext re- 
quirements make this attack totally impractical. No attacks on DEAL-256 faster 
than exhaustive search were discussed. 

1.1 Our Results 

In this paper, we present the following results against DEAL: 

— Equivalent keys for DEAL-192 and DEAL-256, with an algorithm to find 
them. The algorithm requires about six DES encryptions to find a set of 256 
equivalent DEAL-192 keys, and eight DES encryptions to find a set of 256 
equivalent DEAL-256 keys. 

— Equivalent keys for DEAL-128, with an algorithm to find them. The algorithm
requires about $2^{64}$ work to find a pair of equivalent keys.

— A related-key attack on DEAL-192 and DEAL-256, requiring three plaintexts
under $2^{33}$ keys with a certain relationship, $3 \times 2^{45}$ bytes of memory, and
about $2^{137}$ DEAL encryptions’ work, to find the last two rounds’ subkeys for
DEAL-192 and DEAL-256. (With more memory, this can be made faster.)

— A number of possible extensions to these attacks. DEAL-192 can be peeled 
down to four rounds, and then Biham’s attack on four-round Ladder-DES 
can be applied. DEAL-256 can be peeled down to six rounds, and
then Lucks’ attack on six-round DEAL-192 can be applied. Alternatively, 64 
bits can be recovered from the original key, and the remainder brute-force 
searched. 

Importance of the Results. These results have both practical and theoretical 
interest. 
DEAL is likely to see some use in the future in real-world systems. DEAL
is an AES candidate, but even if it is not accepted as an AES finalist, it will 
almost certainly see some use. The general idea behind DEAL is a sound one, 
and has been proposed several times before. As pointed out by
Outerbridge at the first AES conference, widespread availability of DES
hardware in many different environments makes DEAL relatively easy to implement
in many different environments, at very low cost. A system designer in need of a 
128-bit block cipher, and in possession of lots of DES-enabled devices, might do 
well to choose an algorithm like DEAL. (Certainly, he would be better off doing 
this than trying to design his own cipher from DES.) 

In real-world use, the equivalent keys of DEAL have important practical
implications: they make many standard hashing modes, e.g. Davies-Meyer mode,
unsafe to use.¹

The related-key attacks are probably somewhat less practical, but may still 
be important in some applications. These attacks have the effect of peeling off the 
last two rounds of DEAL at the cost of about 2^^^ DEAL encryptions of work, 
using about 3 x 2^® bytes of memory, and requiring the same three plaintexts 
be encrypted under 2^^ related keys. There are various time-memory tradeoffs 
available. 

In the presence of 3 x 2®® bytes of random-access storage, the attack will 
run with about 2^^® work, again recovering the last two round subkey J At 
that point. Biham’s attack on 4-round Ladder-DES can be mounted, 

requiring another 2®® chosen plaintexts (under only one key) and 2®® time. The 
whole attack thus takes about 2^^® work, 3 x 2®® bytes of random-access storage, 
3 known plaintexts encrypted under 2®® related keys, and 2®® chosen plaintexts 
under one of those keys, to be selected after the rest of the attack has run its 
course. This compares with the best previously known attack, which required 
2^1® work, 2®^ memory, and 2^® chosen plaintexts. 

On a theoretical level, our results demonstrate an important fact: It is widely 
assumed that a key schedule that uses strong cryptographic components will, in 
practice, not be vulnerable to cryptanalysis. This assumption has motivated a 
number of ciphers’ key schedules, including those of Khufu, Blowfish,
and SEAL. This assumption, unfortunately, isn’t always true. In
DEAL, a strong cipher is used in an apparently-reasonable way to process key 
material. However, the method used leaves the cipher vulnerable to related-key 
cryptanalysis, as well as allowing the existence of equivalent keys. 



1.2 Guide to the Rest of the Paper 

The rest of this paper is organized as follows: We first discuss the DEAL cipher
and key schedule in the level of detail required for our attacks. We then discuss
equivalent keys in DEAL-192 and DEAL-256, both how to find them and how
many there appear to be. Next, we discuss equivalent keys in DEAL-128. After
that, we discuss related-key differential attacks on DEAL-192 and DEAL-256.
We conclude with a summary of our results, and some questions raised by them.

¹ It has been noted that the slow key schedule of DEAL makes it a poor choice
for hashing applications.

² This assumes that this much random-access storage can be found and used
efficiently; in practice, this attack is of no practical significance, though variant
attacks with lower memory requirements may be.



2 The DEAL Cipher and Key Schedule 



DEAL is a cipher designed originally by Lars Knudsen and submitted
for the AES by Richard Outerbridge. DEAL uses the DES as the round function
of a larger balanced Feistel cipher in a Luby-Rackoff construction.
DEAL works as follows:

Let A, B be the left and right 64-bit halves of the input block, respectively.
Let $R_{0..N-1}$ be the round subkeys, which are 64-bit blocks that are used as 56-bit
DES keys, by ignoring the parity bits. Encryption is as follows: (Here, we show
8 rounds.)

    $A = A \oplus E_{R_0}(B)$
    $B = B \oplus E_{R_1}(A)$
    $A = A \oplus E_{R_2}(B)$
    $B = B \oplus E_{R_3}(A)$
    $A = A \oplus E_{R_4}(B)$
    $B = B \oplus E_{R_5}(A)$
    $A = A \oplus E_{R_6}(B)$
    $B = B \oplus E_{R_7}(A)$.

DEAL has 6 rounds for 128- and 192-bit keys, and 8 rounds for 256-bit keys. The
key schedule works as follows, where E(X) means X encrypted under a constant
key used only for key scheduling, and $K_{0..3}$ are the four 64-bit blocks that make
up a 256-bit key. (The key schedules for DEAL-192 and DEAL-128 are very
similar to the key schedule shown below for DEAL-256, but with only six round
keys generated, and only three or two 64-bit blocks of input key material.)

    $R_0 = E(K_0)$
    $R_1 = E(K_1 \oplus R_0)$
    $R_2 = E(K_2 \oplus R_1)$
    $R_3 = E(K_3 \oplus R_2)$
    $R_4 = E(K_0 \oplus R_3 \oplus 1)$
    $R_5 = E(K_1 \oplus R_4 \oplus 2)$
    $R_6 = E(K_2 \oplus R_5 \oplus 4)$
    $R_7 = E(K_3 \oplus R_6 \oplus 8)$.

The $R_i$ values are used only as DES keys, and so their parity bits are ignored.
This turns out to be very important for our analysis.
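The effect of the ignored parity bits can be illustrated with a small Python sketch of the Feistel structure above. It is a sketch under loud assumptions: `toy_des` is a hypothetical stand-in built from truncated SHA-256 (it is not DES and not even a permutation), and the only property it shares with DES is that it masks out the least significant bit of every key byte before use.

```python
import hashlib

MASK64 = (1 << 64) - 1
PARITY = 0x0101010101010101  # low bit of each key byte, ignored by DES

def toy_des(round_key, block):
    """Hypothetical stand-in for DES: hashes the 56 effective key bits
    together with the 64-bit block and returns 64 bits."""
    effective_key = round_key & ~PARITY & MASK64
    data = effective_key.to_bytes(8, "big") + block.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def deal_encrypt(a, b, round_keys):
    """Feistel rounds as listed above: even rounds update A, odd rounds B."""
    for i, rk in enumerate(round_keys):
        if i % 2 == 0:
            a ^= toy_des(rk, b)
        else:
            b ^= toy_des(rk, a)
    return a, b

# Eight arbitrary round keys, and the same keys with all parity bits flipped.
keys = [0x0123456789ABCDEF ^ (i * 0x1010101010101010) for i in range(8)]
flipped = [k ^ PARITY for k in keys]
same = deal_encrypt(1, 2, keys) == deal_encrypt(1, 2, flipped)
```

Two round-key sequences that differ only in parity bits therefore encrypt identically, which is the sense in which round keys are called "equivalent" in the following sections.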
3 Equivalent Keys in DEAL-192 and DEAL-256

An encryption algorithm has equivalent keys when there are two or more keys,
$K, K^*$, such that $K \neq K^*$ but $E_K(X) = E_{K^*}(X)$ for all X. Equivalent keys
can reduce the effective keyspace of an algorithm in some cases, and, if pairs of
equivalent keys can be found efficiently, render the encryption algorithm unsafe
to use in hashing modes.

We have an algorithm for finding sets of 256 equivalent keys in DEAL. For a 
special class of weak keys consisting of $2^{-64}$ of all keys of length 192 or 256, it is
always possible to find sets of 256 equivalent keys. Further, an efficient algorithm 
exists to find weak keys of this type. Equivalent keys also exist for DEAL-128, 
but a very different algorithm is needed to find these keys, and they are discussed 
in the next section. 



3.1 The Algorithm to Find Sets of Equivalent Keys 

Consider the DEAL key schedule again:

    $R_0 = E(K_0)$
    $R_1 = E(K_1 \oplus R_0)$
    $R_2 = E(K_2 \oplus R_1)$
    $R_3 = E(K_3 \oplus R_2)$
    $R_4 = E(K_0 \oplus R_3 \oplus 1)$
    $R_5 = E(K_1 \oplus R_4 \oplus 2)$
    $R_6 = E(K_2 \oplus R_5 \oplus 4)$
    $R_7 = E(K_3 \oplus R_6 \oplus 8)$.

Our general strategy will be as follows:

1. Find a “weak key” such that $R_0 = R_3 \oplus 2$ for 192-bit keys, or such that
   $R_1 = R_5 \oplus 4$ for 256-bit keys.

2. Choose $\Delta$ active only in parity bits.

3. Let:

    $K_0^* = K_0$
    $K_1^* = K_1$
    $K_2^* = D(R_2 \oplus \Delta) \oplus R_1$
    $K_3^* = K_3 \oplus \Delta$

4. The result is a sequence of round subkeys such that:

    $R_0^* = R_0$
    $R_1^* = R_1$
    $R_2^* = R_2 \oplus \Delta$
    $R_3^* = R_3$
    $R_4^* = R_4$
    $R_5^* = R_5$
    $R_6^* = R_6 \oplus \Delta$
    $R_7^* = R_7$

That is, we choose $K_2^*$ and $K_3^*$ as:

    $K_3^* = K_3 \oplus \Delta$
    $K_2^* = D(R_2 \oplus \Delta) \oplus R_1$

This gives us a pair of equivalent keys:

    $(K_0, K_1, K_2, K_3)$, $(K_0, K_1, K_2^*, K_3^*)$.

In fact, for each $\Delta$ satisfying the above-mentioned requirements, we get a
key equivalent to $(K_0, K_1, K_2, K_3)$. The result is that we get a family of 256
equivalent keys, since there are 256 $\Delta$ values (including zero) that satisfy the
requirement for $\Delta$ to be active only in parity bits.

3.2 Why It Works 

Let’s consider the values of the subkeys for the two related keys
$(K_0, K_1, K_2, K_3)$ and $(K_0, K_1, K_2^*, K_3^*)$.

Recall that:

    $R_1 = R_5 \oplus 4$
    $K_3^* = K_3 \oplus \Delta$
    $K_2^* = D(R_2 \oplus \Delta) \oplus R_1$

Also, recall that $R_i \oplus \Delta$ is equivalent to $R_i$, so long as $\Delta$ is active only in its
parity bits.

1. There is no change in $K_0, K_1$, so there can be no change in $R_0, R_1$. That is,

   We know that:

    $K_0 = K_0^*$
    $K_1 = K_1^*$

   Therefore:

    $R_0 = R_0^*$
    $R_1 = R_1^*$
2. $R_2^* = R_2 \oplus \Delta$ because

   We know that:

    $K_2^* = D(R_2 \oplus \Delta) \oplus R_1$
    $R_2 = E(K_2 \oplus R_1)$

   Therefore:

    $R_2^* = E(K_2^* \oplus R_1^*)$
           $= E(D(R_2 \oplus \Delta) \oplus R_1 \oplus R_1)$
           $= E(D(R_2 \oplus \Delta))$
           $= R_2 \oplus \Delta$

3. $R_3^* = R_3$ because:

   We know that:

    $K_3^* = K_3 \oplus \Delta$
    $R_3 = E(K_3 \oplus R_2)$
    $R_2^* = R_2 \oplus \Delta$

   Therefore:

    $R_3^* = E(K_3^* \oplus R_2^*)$
           $= E(K_3 \oplus \Delta \oplus R_2 \oplus \Delta)$
           $= E(K_3 \oplus R_2)$
           $= R_3$.

4. $R_4$ and $R_5$ are unchanged (that is, $R_4 = R_4^*$, $R_5 = R_5^*$) because $R_4, R_5$ are
   dependent only upon $K_0$, $K_1$, and $R_3$, and we have already established that
   those values are all unchanged.

    $R_4 = E(K_0 \oplus R_3 \oplus 1) = R_4^*$
    $R_5 = E(K_1 \oplus R_4 \oplus 2) = R_5^*$

5. $R_6^* = R_6 \oplus \Delta$, because

   We know that:

    $R_5 = R_1 \oplus 4$
    $R_1 = R_1^*$
    $R_5 = R_5^*$
    $R_6 = E(K_2 \oplus R_5 \oplus 4)$
         $= E(K_2 \oplus R_1 \oplus 4 \oplus 4)$
         $= E(K_2 \oplus R_1)$
         $= R_2$

   Therefore:

    $R_6^* = E(K_2^* \oplus R_5 \oplus 4)$
           $= E(K_2^* \oplus R_1)$
           $= R_2^*$
           $= R_2 \oplus \Delta$

   And thus:

    $R_6^* = R_6 \oplus \Delta$

6. Finally, $R_7 = R_7^*$ because

   We know that:

    $R_7 = E(K_3 \oplus R_6 \oplus 8)$
    $R_6^* = R_6 \oplus \Delta$
    $K_3 = K_3^* \oplus \Delta$

   Therefore:

    $R_7^* = E(K_3^* \oplus R_6^* \oplus 8)$
           $= E(K_3 \oplus \Delta \oplus R_6 \oplus \Delta \oplus 8)$
           $= E(K_3 \oplus R_6 \oplus 8)$
           $= R_7$



3.3 Effect on the DEAL Keyspace 

This set of equivalent keys has essentially no effect on the size of the effective
keyspace, since it applies only to such a tiny fraction (about $3 \times 2^{-64}$) of special
keys.

3.4 Extensions 

A variant of the same algorithm works with $K_1, K_2$ or $K_0, K_1$ as the active pair
of key blocks. A variant of the algorithm can be carried out against DEAL-192.
Against DEAL-128, a much more complex algorithm can be used to find
equivalent keys, as will be discussed later in this paper.

3.5 Efficiently Finding Equivalent Keys 

The naive algorithm for finding equivalent keys would be to try about $2^{64}$
different keys, waiting until $R_1 = R_5 \oplus 4$. This has complexity $2^{64}$, and thus is no
easier than looking for a collision in a 128-bit hash function, such as might be
built from DEAL in Davies-Meyer hashing mode. However, the search for a class
of 256 equivalent keys can be converted to a straightforward algebra problem,
as follows:

1. Choose $K_{0,1,2}$ arbitrarily.

2. Derive:

    $R_0 = E(K_0)$
    $R_1 = E(K_1 \oplus R_0)$
    $R_2 = E(K_2 \oplus R_1)$

3. Use the requirement that $R_5 = R_1 \oplus 4$ to derive:

    $R_5 = R_1 \oplus 4$
         $= E(K_1 \oplus R_4 \oplus 2)$

   Thus:

    $R_4 = D(R_5) \oplus K_1 \oplus 2$

4. Having learned $R_4$, we next compute $R_3$, and thus $K_3$:

    $R_4 = E(R_3 \oplus K_0 \oplus 1)$
         $= D(R_5) \oplus K_1 \oplus 2$

   Thus:

    $R_3 = D(D(R_5) \oplus K_1 \oplus 2) \oplus K_0 \oplus 1$
         $= E(R_2 \oplus K_3)$

   Thus:

    $K_3 = D(R_3) \oplus R_2$
         $= D(D(D(R_5) \oplus K_1 \oplus 2) \oplus K_0 \oplus 1) \oplus R_2$

5. With $K_{0,1,2,3}$, we now have a “weak” key.

The process is nearly identical with DEAL-192.
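The derivation above, together with the construction of Section 3.1, can be exercised end to end on a toy model. The following Python sketch is an illustration under assumptions, not the DEAL implementation: `E` and `D` are a hypothetical invertible 64-bit permutation (a 4-round Feistel network over SHA-256) standing in for DES under the fixed key-scheduling key, and all function names are invented for this example.

```python
import hashlib

MASK64 = (1 << 64) - 1
PARITY = 0x0101010101010101  # low bit of each key byte, ignored by DES

def _f(half, i):
    digest = hashlib.sha256(bytes([i]) + half.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")

def E(x):
    """Toy 64-bit permutation standing in for E(X) of the key schedule."""
    left, right = x >> 32, x & 0xFFFFFFFF
    for i in range(4):
        left, right = right, left ^ _f(right, i)
    return (left << 32) | right

def D(y):
    """Inverse of the toy permutation E."""
    left, right = y >> 32, y & 0xFFFFFFFF
    for i in reversed(range(4)):
        left, right = right ^ _f(left, i), left
    return (left << 32) | right

def round_keys(key):
    """DEAL-256 key schedule R_0..R_7 for key = (K0, K1, K2, K3)."""
    r = [E(key[0])]
    for i in range(1, 4):
        r.append(E(key[i] ^ r[i - 1]))
    for i in range(4, 8):
        r.append(E(key[i - 4] ^ r[i - 1] ^ (1 << (i - 4))))
    return r

def weak_key(k0, k1, k2):
    """Section 3.5: solve for K3 such that R_5 = R_1 xor 4."""
    r0 = E(k0)
    r1 = E(k1 ^ r0)
    r2 = E(k2 ^ r1)
    r5 = r1 ^ 4
    r4 = D(r5) ^ k1 ^ 2
    r3 = D(r4) ^ k0 ^ 1
    return (k0, k1, k2, D(r3) ^ r2)

def equivalent_key(key, delta):
    """Section 3.1: K2* = D(R2 xor delta) xor R1, K3* = K3 xor delta."""
    r = round_keys(key)
    return (key[0], key[1], D(r[2] ^ delta) ^ r[1], key[3] ^ delta)

def effective(r):
    """Round keys with the parity bits masked out, as DES would see them."""
    return [x & ~PARITY & MASK64 for x in r]
```

For any `delta` that is active only in parity bits, `effective(round_keys(equivalent_key(key, delta)))` should equal `effective(round_keys(key))` when `key` comes from `weak_key`, at a cost of a handful of calls to `E` and `D`.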



4 Finding Equivalent Keys in DEAL-128 

In this section³ we discuss an algorithm for finding equivalent keys in DEAL-128.
Unlike the previous algorithm, this does not find classes of 256 equivalent
keys, but instead pairs of equivalent keys. Also unlike the previous algorithm,
this algorithm requires about $2^{64}$ runs of the DEAL key schedule to find a single
pair of equivalent keys.

³ We are indebted to David Wagner for pointing out the possibility of finding
equivalent keys in DEAL-128, and proposing another, earlier method for finding them.
4.1 An Overview of Our Method

The goal is to find a pair of keys, $K, K^*$, such that $R_{0..5}$ and $R^*_{0..5}$ are either
equal or equivalent (equal in all bits except their parity bits, which will be ignored
by the DES key schedule).

4.2 The Algorithm 

1. For each $\Delta$ active only in parity bits:

(a) For each $K_0$ value from 0 to $2^{64} - 1$:

i. Compute $K_0^* = D(E(K_0) \oplus \Delta)$

ii. Compute $K_1 = D(1) \oplus E(K_0)$

iii. Compute $K_1^* = K_1 \oplus \Delta$

iv. Use $K_{0,1}$ to compute $R_{0..5}$, and $K^*_{0,1}$ to compute $R^*_{0..5}$.

v. Note that $R_{0..3}$ and $R^*_{0..3}$ are now equivalent:

\[ R_0 = R_0^* \oplus \Delta \]
\[ R_1 = R_1^* \]
\[ R_2 = R_2^* \oplus \Delta \]
\[ R_3 = R_3^* \]

vi. Check to see whether $R_4 \oplus R_4^* = \Delta$. This should happen with
probability $2^{-64}$.

vii. If so, we’re done; $R_5^*$ will also equal $R_5$. If not, we must keep
looking.
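
Steps i to v can be sketched in Python as follows. This is an illustrative
sketch only: `E` and `D` are a hypothetical toy 64-bit permutation and its
inverse standing in for DES under the fixed key-schedule key, `PARITY_MASK`
assumes that the parity bit is the low bit of each key byte, and
`first_subkeys` uses only the relations for $R_0$ to $R_3$ that appear in this
section. The $2^{64}$ search of step vi is not attempted.

```python
import hashlib

MASK32 = 0xFFFFFFFF
PARITY_MASK = 0x0101010101010101  # assumed position of the ignored parity bits

def _f(half, rnd):
    # Round function of a toy Feistel permutation (illustrative, not DES).
    data = bytes([rnd]) + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def E(x):
    left, right = x >> 32, x & MASK32
    for rnd in range(4):
        left, right = right, left ^ _f(right, rnd)
    return (left << 32) | right

def D(y):
    left, right = y >> 32, y & MASK32
    for rnd in reversed(range(4)):
        left, right = right ^ _f(left, rnd), left
    return (left << 32) | right

def related_pair(k0, delta):
    # Steps i-iii: build (K0, K1) and (K0*, K1*) for a parity-only delta.
    if delta & ~PARITY_MASK:
        raise ValueError("delta must be active in parity bits only")
    r0 = E(k0)
    k0_star = D(r0 ^ delta)
    k1 = D(1) ^ r0
    k1_star = k1 ^ delta
    return (k0, k1), (k0_star, k1_star)

def first_subkeys(k0, k1):
    # R0..R3 of the two-block key schedule, as used in Sect. 4.3.
    r0 = E(k0)
    r1 = E(r0 ^ k1)
    r2 = E(r1 ^ 1 ^ k0)
    r3 = E(r2 ^ 2 ^ k1)
    return [r0, r1, r2, r3]
```

For any starting `k0`, the two keys returned by `related_pair` should yield
subkeys that differ only by `delta` in $R_0$ and $R_2$ and agree in $R_1$ and
$R_3$; only the check on $R_4$ remains probabilistic.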



4.3 Why It Works 

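1. $R_0^* = R_0 \oplus \Delta$ because

We know that:

\[ K_0^* = D(E(K_0) \oplus \Delta) \]
\[ R_0 = E(K_0) \]

Therefore:

\[ R_0^* = E(K_0^*) = E(D(E(K_0) \oplus \Delta)) = E(K_0) \oplus \Delta = R_0 \oplus \Delta \]

2. $R_1^* = R_1$ because:

We know that:

\[ K_1^* = K_1 \oplus \Delta \]
\[ R_0^* = R_0 \oplus \Delta \]
\[ R_1 = E(R_0 \oplus K_1) \]

Therefore:

\[ R_1^* = E(R_0^* \oplus K_1^*) = E(R_0 \oplus \Delta \oplus K_1 \oplus \Delta) = E(R_0 \oplus K_1) = R_1 \]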

3. $R_1 = 1$, because

We know that:

\[ K_1 = D(1) \oplus E(K_0) = D(1) \oplus R_0 \]

Therefore:

\[ R_1 = E(R_0 \oplus K_1) = E(R_0 \oplus E(K_0) \oplus D(1)) = E(R_0 \oplus R_0 \oplus D(1)) = E(D(1)) = 1 \]



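4. $R_1 = 1$ is necessary so that $R_2^* = R_2 \oplus \Delta$.

We know that:

\[ R_0^* = E(K_0^*) = R_0 \oplus \Delta \]
\[ R_1 = 1 = R_1^* \]
\[ R_2 = E(R_1 \oplus 1 \oplus K_0) = E(K_0) = R_0 \]

Therefore:

\[ R_2^* = E(R_1^* \oplus 1 \oplus K_0^*) = E(R_1 \oplus 1 \oplus K_0^*) = E(K_0^*) = R_0^* = R_0 \oplus \Delta = R_2 \oplus \Delta \]

5. $R_3^* = R_3$ because

We know that

\[ R_2^* = R_2 \oplus \Delta \]
\[ K_1^* = K_1 \oplus \Delta \]

Therefore:

\[ R_3^* = E(R_2^* \oplus 2 \oplus K_1^*) = E(R_2 \oplus \Delta \oplus 2 \oplus K_1 \oplus \Delta) = E(R_2 \oplus 2 \oplus K_1) = R_3 \]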

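6. We keep trying different values for $(K_0, K_1)$ until we see
$R_4^* = R_4 \oplus \Delta$.

7. $R_5^* = R_5$ because

We know that:

\[ R_4^* = R_4 \oplus \Delta \]
\[ K_1^* = K_1 \oplus \Delta \]

Therefore:

\[ R_5^* = E(R_4^* \oplus 8 \oplus K_1^*) = E(R_4 \oplus \Delta \oplus 8 \oplus K_1 \oplus \Delta) = E(R_4 \oplus 8 \oplus K_1) = R_5 \]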

5 Related-Key Attacks on DEAL-256 and DEAL-192 

Consider the algorithm for finding equivalent keys in DEAL-256. If we applied
the algorithm without the special key property that $R_5 = R_1 \oplus 4$, we
would end up with nearly equivalent keys: keys with the same subkeys for all
but the last two rounds. We could then mount an attack based on this fact,
given encryptions from the two keys.

Here, we will discuss a related-key attack based on finding a pair of nearly- 
equivalent keys. We will discuss several issues with this attack, and then present 
the whole attack: 

— How to detect that we have a pair of nearly-equivalent keys. 

— How to use detection to learn information about the key. 

— How to extract the last two rounds’ subkeys when this property holds. 

— How to mount the full attack. 

5.1 Detecting Nearly-Equivalent Keys 

Given three plaintext/ciphertext pairs from a pair of keys, $(K, K^*)$,
believed to be nearly-equivalent, we can determine whether they have this
property with very high probability of being right, at the cost of about 2®"^
work and about $3 \times 2^{56}$ memory locations. We mount something very
similar to the meet-in-the-middle attack on double DES encryption.

Consider one text, broken into two 64-bit halves, $(A_0, B_0)$. All but the
last two rounds of encryption are identical between the keys, so after the
identical rounds, we get $(C_0, D_0)$ for this plaintext under both keys. The
last two rounds are different, so we get $(Y_0, Z_0)$ from $K$, and
$(Y_0^*, Z_0^*)$ from $K^*$.

Note that:

\[ Z_0 = D_0 \oplus E_{R_7}(Y_0) \]
\[ Z_0^* = D_0 \oplus E_{R_7^*}(Y_0^*) . \]

We know three plaintext/ciphertext pairs, so we know three different sets of
$Y_0$, $Y_0^*$, and $Z_0$, $Z_0^*$ values. We can mount a DES keysearch effort
on $R_7$ and $R_7^*$. We try all $2^{56}$ possible values of $R_7$, and for each
one, we get candidate $D_0$ values from all three plaintexts. We do the same
for all possible values of $R_7^*$. We get two tables of $2^{56}$ different
192-bit values, which must be sorted. We then find the matches between the two
tables. For 192-bit keys, the keysearch would be on $R_5$ and $R_5^*$.

If we find a pair of matching values, it is overwhelmingly likely that we have
found the right values for $R_7$, $R_7^*$ (or $R_5$, $R_5^*$).
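
The matching step can be illustrated at toy scale. The sketch below is an
assumption-laden illustration rather than the attack itself: `E` is a
hypothetical keyed 16-bit function standing in for DES, subkeys are only 8 bits
wide, and a dictionary lookup replaces the sorted tables described above. Each
text is a tuple `(Y, Z, Y_star, Z_star)`.

```python
import hashlib

def E(key, block):
    # Hypothetical keyed 16-bit function standing in for DES (illustrative).
    data = key.to_bytes(2, "big") + block.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def meet_in_the_middle(texts, key_bits=8):
    # For every candidate last-round subkey R, compute the tuple of candidate
    # shared values D = Z xor E_R(Y) over all texts and index it.
    table = {}
    for r in range(1 << key_bits):
        cand = tuple(z ^ E(r, y) for (y, z, _, _) in texts)
        table.setdefault(cand, []).append(r)
    # Do the same for R* and report every pair whose tuples coincide.
    matches = []
    for r_star in range(1 << key_bits):
        cand = tuple(zs ^ E(r_star, ys) for (_, _, ys, zs) in texts)
        for r in table.get(cand, []):
            matches.append((r, r_star))
    return matches
```

With three texts the matched tuple is three blocks long, which is why a match
is very unlikely to occur by chance when the keys are not nearly-equivalent.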

This shows how to determine whether a pair of keys is nearly-equivalent, but
not how to find which pair in a batch of $2^{33}$ of them is nearly-equivalent.

Imagine a situation in which we had unlimited memory resources. We could 
do the same kind of meet-in-the-middle computation described above, but on all 
$2^{33}$ keys. This would take $2^{33} \times 2^{56} = 2^{89}$ encryptions, $89 \times 2^{89}$ swap operations,
and about 2®® bytes of memory. At the end, we would sweep through the 2®® 
192-bit blocks computed from three ciphertexts under each key, and look for 
duplicates. We would not expect to see any duplicates (though it wouldn’t be 
totally surprising to see them) unless there is a pair of nearly-equivalent keys. 
Any duplicates that came either from the same key, or from keys with the same 
A value would simply be ignored. 

In practice, we have limited memory resources, and so we consider time-memory
tradeoffs.

The time-memory tradeoffs available here can be summarized as follows:

Memory (bytes)        Work (DES encryptions)

$3 \times 2^{69}$     $2^{113}$
$3 \times 2^{61}$     $2^{121}$
$3 \times 2^{53}$     $2^{129}$
$3 \times 2^{45}$     $2^{137}$
$3 \times 2^{37}$     $2^{145}$



5.2 Extracting the Rest of the Key 

Once we know $R_7$, $R_7^*$, we can mount the same kind of attack to get $R_6$,
$R_6^*$. We have then peeled off the last two rounds, and have a six-round
cipher remaining to attack. (In the case of DEAL-192, we have a four-round
cipher remaining to attack.) In the case of DEAL-256, knowing $R_6$ and $R_7$
allows us to find $K_3$. In the case of DEAL-192, knowing $R_4$ and $R_5$ allows
us to find $K_2$. This leaves us with a 192-bit search to break DEAL-256, or a
128-bit search to break DEAL-192.



5.3 Selecting the Keys 

Let $K$ be the original key. Let $K_i$ be the ith additional key requested. We
request $2^{33}$ keys such that:

— Start with initial targeted key, $K$, and $\Delta$ active in parity bits only.

— For i = 0 to 255, do

• Let $\Delta_i$ = next delta active in parity bits only.

• For j = 0 to 2^®, do 



\[ K[0]_i = K[0] \]
\[ K[1]_i = K[1] \oplus \mathrm{Random\_Block}_j \]
\[ K[2]_i = K[2] \oplus \Delta_i \]

\[ R_2[i] = R_2[j] \oplus \Delta . \]

by the birthday paradox. So, we will have A pairs of keys to test, of which we 
expect one pair to be nearly-equivalent.

Footnote: These computational cost estimates assume memory available with no
additional costs for random accesses. If the attack were implemented with tape
memory, for example, then the actual time taken for the attack would go up
substantially.

5.4 The Full Attack

The full attack is thus carried out as follows: 

1. We request $2^{33}$ related keys according to the pattern described above. We
expect one pair of these keys to be nearly-equivalent, but we don’t yet know
which pair.

2. We request the same three chosen plaintexts to be encrypted under each 
key. (We don’t have to be able to choose anything about them, but the same 
three plaintexts must be encrypted under each key.) 

3. We apply our test to the whole set of ciphertexts from the related keys. Given
$3 \times 2^{45}$ bytes of memory, we will have to carry out $2^{137}$ DES
encryptions.

4. Let $K$, $K^*$ be the pair of nearly-equivalent keys, which we have now
detected. In detecting the property, we have learned the last round’s subkey.
We now apply the same meet-in-the-middle attack to find the next-to-last
round’s subkey. (In DEAL-256, this is $R_6$; in DEAL-192, this is $R_4$.)

5. We may now either apply some other attack on the cipher with two fewer 
rounds, or we may use knowledge of the last two rounds’ subkeys to learn 
64 bits of the input key, and then brute-force the remaining key. 

6. Assuming we just brute-force the remaining key, the attacks on DEAL-192 
and DEAL-256 both require 2^^ related-key queries, the same three chosen- 
plaintexts requested under each key, and 3 x 2®^ bytes of memory. The attack 
on DEAL-192 then requires 2^^® work, and the attack on DEAL-256 requires 
2^®® work. 

7. There may be improved attacks that exploit weaknesses in four- or six-
round DEAL once we have discovered the last two round keys. For example,
Biham’s attack against Ladder-DES can also be applied to DEAL-192, once
the last two rounds have been peeled off.

6 Conclusions 

In this paper, we have demonstrated a weakness in the key schedule of DEAL, 
leading to both equivalent keys and vulnerability to related-key attacks. While 
the related-key attacks are of primarily academic interest (requiring 2^®® DEAL 
encryptions worth of work for the cheapest attack), the equivalent keys are of 
immediate interest for anyone using DEAL in certain hashing modes. The im- 
portant lessons we draw from this analysis are: 

1 . Simply using a cryptographic primitive in a reasonable-looking way to design 
a key schedule does not guarantee resistance to attacks on the key schedule. 

2. In the specific case of DEAL, ignoring the parity bits of the keys sent in
allowed nearly-equivalent keys to be found. A special class of keys was
then found for which, instead of nearly-equivalent keys, these keys would
be equivalent. Had those bits been immediately used, our attacks would not
work.

Unfortunately, we don’t have a general design principle we can pull out of this
analysis; designing key schedules is hard, and there aren’t any sure-fire
shortcuts. This is borne out by the long list of AES candidates cryptanalyzed
based on their key schedules, which appears in the introduction.

6.1 Open Questions 

A number of questions are raised by this research: 

1. Are there key schedules we can build from cryptographic mechanisms that 
are provably secure against various forms of attack? 

2. In the absence of these, can we at least find some useful design principles for 
cryptographic key schedules? 

3. Are there similar attacks on other cryptographic key schedules, e.g., those
of Khufu, Blowfish, and SEAL?

Abstract. Interpolation attack was presented by Jakobsen and Knudsen
at FSE’97. Interpolation attack is effective against ciphers that have
a certain algebraic structure like the PURE cipher, which is a prototype
cipher, but it is difficult to apply the attack to real-world ciphers. This
difficulty is due to the difficulty of deriving a low degree polynomial
relation between ciphertexts and plaintexts. In other words, it is difficult to
evaluate the security against interpolation attack. This paper generalizes
the interpolation attack. The generalization makes it easier to evaluate the
security against interpolation attack. We call the generalized interpolation
attack linear sum attack. We present an algorithm that efficiently
evaluates the security of byte-oriented ciphers against linear sum attack.
Moreover, we show the relationship between linear sum attack and higher
order differential attack. In addition, we show the security of CRYPTON,
E2, and RIJNDAEL against linear sum attack using the algorithm.



1 Introduction 



Interpolation attack was presented for attacking the PURE cipher, though the
PURE cipher is provably secure against differential cryptanalysis and linear
cryptanalysis. The basic idea of interpolation attack is as follows: First, the
attack focuses on the algebraic structure in the cipher. Next, the attack tries
to express ciphertexts using a polynomial of a plaintext. The applicability of
the attack is determined by the degree of the polynomial above, more precisely,
by the number of the unknown coefficients of the polynomial.

It is easy to find the degree of the polynomial for the PURE cipher since the
non-linear operation in the PURE cipher is only a cubic operation in $GF(2^n)$.
However, it is basically difficult to find the degree for real-world ciphers.
We know of only two examples of successful interpolation attacks against
ciphers. One is an attack on a modified version of SHARK. The other is an
attack on SNAKE. The non-linear operation of both ciphers is an inversion in
$GF(2^n)$, which is also a simple operation in $GF(2^n)$. On the other hand, no
cipher is known to be provably secure against interpolation attack.

First, this paper introduces the concept of linear sum attack, a generalization
of interpolation attack. Introducing linear sum attack leads to a clear vista on
studying the security against interpolation attack. Next, the paper proposes an
effective algorithm which judges whether linear sum attack is applicable or not
for a given cipher. Moreover, we show a relationship between linear sum attack
and higher order differential attack: provable security against linear sum
attack implies provable security against higher order differential attack.
Finally, we evaluate the security of CRYPTON, E2, and RIJNDAEL against
linear sum attack using the security evaluation algorithm for linear sum attack.



2 Preliminaries 

2.1 Notations and Analysis Target 

This paper studies the following situation. Let $p$ be a plaintext and $c$ be a
ciphertext. Let $c = E_k(p)$ be a block cipher whose block is n bits long with a
product structure. The encryption key $k$ is in the set
$K$ $(= \{k_1, k_2, \ldots, k_L\})$. $E_k(p)$ consists of $R$ round functions
$F_{k^{(r)}}$ $(r = 1, 2, \ldots, R)$ as follows:

\[ E_k(p) = (F_{k^{(R)}} \circ F_{k^{(R-1)}} \circ \cdots \circ F_{k^{(1)}})(p) , \]

where $k^{(r)}$ is the rth round subkey, generated from $k$ by a key scheduling
algorithm.

We define $\tilde{c}$ as the input of the last round,

\[ \tilde{c} = \tilde{E}_k(p) = (F_{k^{(R-1)}} \circ F_{k^{(R-2)}} \circ \cdots \circ F_{k^{(1)}})(p) . \]

Moreover, we consider the following maps used in the interpolation attack (see
Fig. 1).

\[ p : GF(q) \to GF(2)^n \]
\[ c' : GF(2)^n \to GF(q) , \]

where $GF(q)$ is a finite field that contains $q$ elements. This paper considers
interpolation attacks using polynomials in $GF(q)$. Note that we do not assume
that $q$ is a power of 2 or that $p$ and $c'$ are bijective.



[Figure: the map $p$ takes $GF(q)$ to the plaintext space $GF(2)^n$, the cipher
takes plaintext to ciphertext in $GF(2)^n$, and $c'$ takes $GF(2)^n$ back to
$GF(q)$; the composed map from $GF(q)$ to $GF(q)$ is $f$.]

Fig. 1. Attack diagram

2.2 Interpolation Attack

Although several types of interpolation attack are known, this section describes
the basic interpolation attack. If the reader is not familiar with interpolation
attack, please refer to the original paper by Jakobsen and Knudsen.

An outline of the attack is as follows.

Preparation: Find $p$ and $c'$ that satisfy

\[ c'(\tilde{E}_k(p(x))) = f_k(x) \in GF(q)[x] , \]

by analyzing the target cipher. Let $N$ be the number of the unknown
coefficients of the polynomial $f_k(x)$.

Attack:

Step 1: Obtain $N + 1$ ciphertexts $c = E_k(p(x))$ that are derived from the
chosen plaintexts $p(x)$ $(x \in GF(q))$.

Step 2: Guess $k^{(R)}$ using exhaustive search.

2-1: Calculate $\tilde{c} = F^{-1}_{k^{(R)}}(c)$ from obtained $c$ and guessed
$k^{(R)}$ to decrypt 1 round.

2-2: Find $f_k(x)$ from $N$ pairs of $(x, c'(\tilde{c}))$ using polynomial
interpolation.

2-3: Verify the correctness of $f_k(x)$ derived in Step 2-2 using a pair
of $(x, c'(\tilde{c}))$ not used in Step 2-2.

This interpolation attack can be applied if N < q holds. However, it is very 
difficult to estimate N precisely for a real-world cipher. We give an answer to 
solve the problem in the following sections. 
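
Steps 2-2 and 2-3 can be illustrated with a small Python sketch. It is an
illustration under stated assumptions, not part of the original attack
description: it works over a prime field $GF(p)$ only, it uses Lagrange
interpolation as the polynomial interpolation algorithm, and the function names
are hypothetical. Rather than returning coefficients, it evaluates the
interpolated polynomial at the held-back point, which is all the verification
in Step 2-3 needs.

```python
def lagrange_eval(points, x0, p):
    # Evaluate at x0 the unique polynomial of degree < len(points) over GF(p)
    # that passes through the given (x, y) pairs (x values must be distinct).
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x0 - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

def guess_is_consistent(points, held_back, p):
    # Step 2-3: a correct subkey guess should predict the held-back pair.
    x, y = held_back
    return lagrange_eval(points, x, p) == y % p
```

A wrong guess for $k^{(R)}$ is expected to produce pairs that do not lie on a
polynomial with only $N$ unknown coefficients, so the held-back pair fails the
check with high probability.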

3 Linear Sum Attack 

Consider the interpolation attack replacing the polynomial interpolation with
Gaussian elimination in Step 2-2 described in Sect. 2.2. In this case, we can
attack a cipher in the same way even if $f_k(x)$ is represented by a linear sum
of linearly independent polynomials $b_i(x) \in GF(q)[x]$ as in

\[ f_k(x) = \sum_{i=1}^{N} a_i(k) b_i(x) \quad (a_i(k) \in GF(q)) . \]

We call this attack the linear sum attack.

The attack succeeds if the number of unknown $a_i(k)$s is less than $q$. We
estimate the worst case complexity. The number of chosen plaintexts is at most
$q$. The attack requires Gaussian eliminations corresponding to all possible
values of $k^{(R)}$. It is well known that Gaussian elimination requires
$O(q^3)$ arithmetic operations in $GF(q)$. So, the attack requires $O(Lq^3)$
arithmetic operations in $GF(q)$ and $L$ evaluations of $F^{-1}_{k^{(R)}}$.

Linear sum attack is equivalent to interpolation attack if $b_i(x) = x^{i-1}$
holds, that is, $b_i(x)$ is a monomial. Consider the case of
$f_k(x) = g(k) \cdot x + 2g(k) \cdot 1$, for example. If we apply interpolation
attack described in Sect. 2.2, we need 3 chosen plaintexts since the number of
unknown coefficients is 2, which are $g(k)$ and $2g(k)$. On the other hand,
applying linear sum attack, we can factorize the polynomial to
$f_k(x) = g(k) \cdot (x + 2)$. This means that we need only 2 chosen
plaintexts, since the number of unknown coefficients is 1, which is $g(k)$. As
shown by this example, linear sum attack requires no more chosen plaintexts
than interpolation attack.
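
The example can be made concrete with a few lines of Python. The sketch assumes
a prime field $GF(257)$ purely for illustration, and `recover_g` is a
hypothetical helper name: with the basis element $b(x) = x + 2$, one chosen
plaintext determines $g(k)$ and a second one verifies it.

```python
P = 257  # illustrative prime field size (an assumption, not from the text)

def recover_g(x, y, p=P):
    # f_k(x) = g(k) * (x + 2), so a single pair (x, y) with x + 2 != 0
    # determines the one unknown coefficient g(k).
    b = (x + 2) % p
    if b == 0:
        raise ValueError("choose x with x + 2 != 0 in GF(p)")
    return y * pow(b, -1, p) % p

def verify_g(g, x, y, p=P):
    # The second chosen plaintext checks the recovered coefficient.
    return g * (x + 2) % p == y % p
```

Treating the same function as a generic degree-1 polynomial would instead
require solving for two coefficients and checking with a third plaintext.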



4 Search for Effective Basis 



This section discusses how to find an effective basis
$\{b_1(x), b_2(x), \ldots, b_q(x)\}$ for linear sum attack. Linear sum attack
requires a basis while interpolation attack requires a polynomial expression of
ciphertexts, where we regard a plaintext as a variable for interpolation
attack. This section introduces an effective search algorithm for finding an
effective basis.

We focus on the following properties of $GF(q)$.

1. Any function over $GF(q)$ can be expressed by a polynomial over $GF(q)$.

2. The set of all functions over $GF(q)$ is a q-dimensional vector space,
$GF(q)[x]$ (see the footnote on this notation).

3. Any polynomial over $GF(q)$ can be expressed by a linear sum of a basis
$\{b_1(x), b_2(x), \ldots, b_q(x)\}$, where $b_i(x) \in GF(q)[x]$
$(i = 1, 2, \ldots, q)$.

Using the above facts, we developed an algorithm for finding a basis
$\{b_1(x), b_2(x), \ldots, b_q(x)\}$ so that $f_k(x) = c'(E_k(p(x)))$ has the
fewest unknown coefficients when $f_k(x)$ is expressed by a linear sum using
the basis.

Assume that $f_k(x)$ is expressed as

\[ f_k(x) = \sum_{i=1}^{q} a_i(k) b_i(x) \quad (a_i(k) \in GF(q)) . \]

The smallest number of unknown coefficients we want to find is

\[ N = \mathrm{rank} \begin{pmatrix}
a_1(k_1) & a_2(k_1) & \cdots & a_q(k_1) \\
a_1(k_2) & a_2(k_2) & \cdots & a_q(k_2) \\
\vdots & \vdots & & \vdots \\
a_1(k_L) & a_2(k_L) & \cdots & a_q(k_L)
\end{pmatrix} . \]

It is practically impossible to calculate the rank described above for all bases
and for all keys $k_1, k_2, \ldots, k_L$, since the complexity exceeds an
exhaustive search for a key. We solve the problem by the following theorems.



Theorem 1. The expectation of $d$ is less than $q + 2$, where $d$ is defined as

\[ \dim_{GF(q)} \langle v_1, v_2, \ldots, v_d \rangle = q , \]

for randomly chosen $v_i$ in the q-dimensional vector space over $GF(q)$.

Footnote: For simple description, we use $GF(q)[x]$ for $GF(q)[x]/(x^q - x)$.

Proof. Since a randomly chosen element in the q-dimensional vector space over
$GF(q)$ is contained in a particular i-dimensional $(i < q)$ subspace with
probability $q^i/q^q$, we need to choose, on average, $1/(1 - q^i/q^q)$ elements
in order to find one that is not in the subspace. Thus, the expectation of $d$
can be evaluated as follows (substituting $j = q - i$ and using $q \ge 2$ for
the final bound).

\[ \sum_{i=0}^{q-1} \frac{1}{1 - q^i/q^q}
   = \sum_{j=1}^{q} \frac{q^j}{q^j - 1}
   = q + \sum_{j=1}^{q} \frac{1}{q^j - 1}
   < q + 2 \]

Theorem 2. $q + r$ $(r > 0)$ randomly chosen vectors in the q-dimensional
vector space over $GF(q)$ span at most a $(q - 1)$-dimensional subspace with
probability at most $q^{-r}$.

Proof.

\[ \Pr[\dim_{GF(q)} \langle v_1, v_2, \ldots, v_{q+r} \rangle \le q - 1]
   = \sum_{i=0}^{q-1} \Pr[\dim_{GF(q)} \langle v_1, v_2, \ldots, v_{q+r} \rangle = i] \]

Since the dimension of the vector space
$\langle v_1, v_2, \ldots, v_{q+r} \rangle$ is $i$, we can choose $i$ vectors
which span the i-dimensional vector space from
$\{v_1, v_2, \ldots, v_{q+r}\}$. We assume
$\dim_{GF(q)} \langle v_1, v_2, \ldots, v_i \rangle = i$ without loss of
generality.

\[ \sum_{i=0}^{q-1} \Pr[\dim_{GF(q)} \langle v_1, v_2, \ldots, v_{q+r} \rangle = i] \]
\[ \le \sum_{i=0}^{q-1} \Pr[\{v_{i+1}, v_{i+2}, \ldots, v_{q+r}\} \subset \langle v_1, v_2, \ldots, v_i \rangle] \]
\[ = \sum_{i=0}^{q-1} \left( \frac{q^i}{q^q} \right)^{q+r-(i+1)+1} \]
\[ = \sum_{i=0}^{q-1} q^{-(q-i)(q+r-i)} \]
\[ < q \cdot q^{-(q-(q-1))(q+r-(q-1))} \]
\[ = q^{-r} \]

□

Corollary 3. $q + r$ $(r > 0)$ randomly chosen vectors in the q-dimensional
vector space over $GF(q)$ span the q-dimensional subspace with probability at
least $1 - q^{-r}$.

Assume that $f_k(x)$ is random in $GF(q)[x]$ if we randomly choose $k$. Then,
according to Theorem 1 and Corollary 3, it is sufficient to calculate $N$, i.e.
the rank, using $q + 2$ randomly chosen keys $k$.

Thus, we can find the basis for the smallest number of coefficients with
probability at least $1 - q^{-2}$ by calculating

\[ N = \mathrm{rank} \begin{pmatrix}
a_1(k_{i_1}) & a_2(k_{i_1}) & \cdots & a_q(k_{i_1}) \\
a_1(k_{i_2}) & a_2(k_{i_2}) & \cdots & a_q(k_{i_2}) \\
\vdots & \vdots & & \vdots \\
a_1(k_{i_{q+2}}) & a_2(k_{i_{q+2}}) & \cdots & a_q(k_{i_{q+2}})
\end{pmatrix} , \]

where $\{k_{i_1}, k_{i_2}, \ldots, k_{i_{q+2}}\}$ is a random subset of $K$ and
$a_1(k_{i_j}), a_2(k_{i_j}), \ldots, a_q(k_{i_j})$ $(j = 1, 2, \ldots, q + 2)$
are coefficients of the polynomial basis $\{1, x, \ldots, x^{q-1}\}$, derived
by some polynomial interpolation algorithm. Since a rank is invariant under a
change of basis, it is sufficient to consider only the polynomial basis.

We summarize the basis search algorithm. 

Algorithm 4.

Step 1: Choose appropriate parameters for the attack:

— a finite field $GF(q)$

— a map $p : GF(q) \to GF(2)^n$

— a map $c' : GF(2)^n \to GF(q)$

Step 2: Generate $q + 2$ randomly chosen keys
$k_{i_1}, k_{i_2}, \ldots, k_{i_{q+2}} \in K$.

Step 3: Calculate all input-output pairs of
$f_{k_{i_j}}$ $(= c' \circ E_{k_{i_j}} \circ p)$,

\[ (x, f_{k_{i_j}}(x)) \]

for all $x \in GF(q)$ and $1 \le \forall j \le q + 2$.

Step 4: Using some polynomial interpolation algorithm, determine coefficients

\[ a_1(k_{i_j}), a_2(k_{i_j}), \ldots, a_q(k_{i_j}) \]

of polynomial $f_{k_{i_j}}(x) = \sum_{l=1}^{q} a_l(k_{i_j}) x^{l-1}$ for
$q + 2$ keys $k_{i_j}$ $(1 \le j \le q + 2)$ using the input-output pairs of
$f_{k_{i_j}}$ calculated in Step 3.

Step 5: Calculate the number of effective coefficients

\[ N = \mathrm{rank} \begin{pmatrix}
a_1(k_{i_1}) & a_2(k_{i_1}) & \cdots & a_q(k_{i_1}) \\
a_1(k_{i_2}) & a_2(k_{i_2}) & \cdots & a_q(k_{i_2}) \\
\vdots & \vdots & & \vdots \\
a_1(k_{i_{q+2}}) & a_2(k_{i_{q+2}}) & \cdots & a_q(k_{i_{q+2}})
\end{pmatrix} \]

using Gaussian elimination.

A proper program for Gaussian elimination to calculate the rank can also find
the effective basis for an attack.
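
A compact Python sketch of the rank computation follows. It makes several
simplifying assumptions that are not in the original algorithm: it works over a
prime field $GF(p)$ rather than $GF(2^8)$, the argument `f(k, x)` is a
hypothetical toy stand-in for $c' \circ E_k \circ p$, and Step 4 is skipped by
taking the rank of the table of function values directly. The last
simplification relies on the same observation as the text, namely that the rank
does not depend on the basis: evaluating at all $q$ points is an invertible
linear change of representation.

```python
def rank_mod_p(rows, p):
    # Rank of a matrix over GF(p), p prime, by Gaussian elimination.
    m = [[v % p for v in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [v * inv % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def effective_coefficients(f, keys, p):
    # Steps 3 and 5: tabulate f_k(x) for every x in GF(p) and every chosen
    # key, then return the rank of that table.
    table = [[f(k, x) % p for x in range(p)] for k in keys]
    return rank_mod_p(table, p)
```

With $p = 11$ and $q + 2 = 13$ keys, a toy function such as
$f_k(x) = (3k + 1)(x + 2)$ has rank 1, while $f_k(x) = (x + k)^3$ has rank 4,
one for each of the monomials $1, x, x^2, x^3$ it can involve.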

A cipher is secure against linear sum attack if N equals q. In other words, 
linear sum attack is effective if N is less than q. 

We studied the complexity of the above algorithm. Note that in Step 4 we
can interpolate polynomials for each $k_{i_j}$ by calculating only 1 Gaussian
elimination, which requires $O(q^3)$ arithmetic operations in $GF(q)$. The
algorithm requires $O(q^3)$ $(= (q + 2) \times O(q^2) + O(q^3))$ arithmetic
operations in $GF(q)$, with the assumption that the encryption time is much
less than Gaussian elimination. Thus, it is sufficient for recent computers to
calculate $N$ if $q \approx 2^8$.



5 Experimental Results 

This section evaluates the security of CRYPTON, E2, and RIJNDAEL, using
Algorithm 4. CRYPTON, E2, and RIJNDAEL have 12, 12, and 10 rounds,
respectively, and the basic operations of these ciphers are 8 bits long.

Unfortunately, since it is infeasible to check all combinations of $GF(q)$,
$p$, $c'$, we ran the algorithm for only the following combinations.

- $GF(q) = GF(2^8)$

- $p_i : x \mapsto (0, \ldots, 0, x, 0, \ldots, 0)$, with $x$ in the ith
position $(i = 1, 2, \ldots, 16)$

- $c'_j : (x_1, x_2, \ldots, x_{16}) \mapsto x_j$ $(j = 1, 2, \ldots, 16)$

The results are summarized in Table 1. We evaluated only the 128-bit key
versions of the ciphers. We count the number of rounds as 0 in the case of the
cipher with only initial transformation.



Table 1. Smallest number of unknown coefficients

Number of Rounds   CRYPTON   E2    E2*   RIJNDAEL

0                  1         1     —     1
1                  1         1     0     1
2                  252       1     1     255
3                  255       1     1     255
≥ 4                256       256   256   256

*: without IT- and FT-Functions



According to Table 1, these ciphers have no linear sum relation that extends
over many rounds, compared with the number of rounds in their specifications.
It seems that these ciphers are secure against the generalized interpolation
attack, i.e., linear sum attack.

Footnote: We evaluated only the 128-bit block length version of RIJNDAEL.

The goal of this paper is the security evaluation of a given cipher against
linear sum attack. Thus, we do not go into the details of the attacks for these
ciphers; however, a rough sketch of the attacks using Table 1 is shown in the
Appendix.



6 Relationship between Linear Sum Attack and Higher 
Order Differential Attack 

This section describes the strength of linear sum attack in comparison with 
higher order differential attack. 



Definition 5. $E_k(p)$ is secure against linear sum attack with respect to
$GF(q)$, $p$, and $c'$ if and only if $N = q$ holds, where $N$ is determined by
Algorithm 4.

Definition 6. Let $e_k^{(i)}(p)$ be the ith output bit of $E_k(p)$, i.e.,

\[ E_k(p) = (e_k^{(1)}(p), e_k^{(2)}(p), \ldots, e_k^{(n)}(p)) . \]

Let $p$ be a map

\[ p : GF(2)^t \to GF(2)^n ; \quad (x_1, x_2, \ldots, x_t) \mapsto (p_1, p_2, \ldots, p_n) , \]

where

\[ p_i = \begin{cases} x_{\pi^{-1}(i)} & \text{if } \pi^{-1}(i) \text{ is defined} \\ \text{constant} & \text{otherwise} \end{cases} \]

and $\pi$ is an injective map from $\{1, 2, \ldots, t\}$ to
$\{1, 2, \ldots, n\}$.

$E_k(p)$ is secure against higher order differential attack with respect to $p$
and $i$ if and only if $\deg e_k^{(i)}(p(x_1, x_2, \ldots, x_t)) = t$ holds.

Note that Definition 6 does not consider improved higher order differential
attacks or the case of $t = n$, and if a cipher is not secure against higher
order differential attack according to Definition 6, we cannot conclude that
the cipher is insecure against an actual higher order differential attack.

The following theorem means that resistance against linear sum attack, which
is a generalized interpolation attack, implies resistance against higher order
differential attack.

Theorem 7. Let $p$ be a map

\[ p : GF(2^t) \to GF(2)^n ; \quad (x_1, x_2, \ldots, x_t) \mapsto (p_1, p_2, \ldots, p_n) , \]

where

\[ p_i = \begin{cases} x_{\pi^{-1}(i)} & \text{if } \pi^{-1}(i) \text{ is defined} \\ \text{constant} & \text{otherwise} \end{cases} \]

and $\pi$ is an injective map from $\{1, 2, \ldots, t\}$ to
$\{1, 2, \ldots, n\}$. Let $c'$ be a map

\[ c' : GF(2)^n \to GF(2^t) ; \quad (c_1, c_2, \ldots, c_n) \mapsto (y_1, y_2, \ldots, y_t) , \]

where $y_i = c_{\tau(i)}$ and $\tau$ is an injective map from
$\{1, 2, \ldots, t\}$ to $\{1, 2, \ldots, n\}$.

For $1 \le \forall i \le t$, if $E_k(p)$ is secure against linear sum attack
with respect to $GF(2^t)$, $p$, and $c'$, then $E_k(p)$ is secure against
higher order differential attack with respect to $p$ and $\tau(i)$.

Note that we regard an element $(a_1, a_2, \ldots, a_t) \in GF(2)^t$ as
$a \in GF(2^t)$ with GF(2) basis.

Before proving Theorem 7, we show a well-known lemma. This lemma is given in
the literature, for example as Proposition 4 on p. 60 of the cited reference.

Lemma 8. Let $y = x^d$ in $GF(2^t)$ and regard
$(y_1, y_2, \ldots, y_t) \in GF(2)^t$ as $y$ with GF(2) basis.

\[ \deg y_i = w_H(d) \quad \text{for } 1 \le \forall i \le t \]

holds, where $x$ is regarded as $(x_1, x_2, \ldots, x_t) \in GF(2)^t$ with
GF(2) basis and $w_H(d)$ is the Hamming weight of the binary representation of
$d$.

Proof (of Theorem 7). According to the assumption of the theorem and
Definition 5, $c'(E_k(p(x)))$ should be expressed as

\[ c'(E_k(p(x))) = \sum_{i=1}^{2^t} a_i(k) x^{i-1} , \]

where $a_i(k) \in GF(2^t)$ is an unknown coefficient for
$1 \le \forall i \le 2^t$. Using Lemma 8,

\[ \deg(x^d) = w_H(d) \begin{cases} = t & \text{if } d = 2^t - 1 \\ < t & \text{otherwise} \end{cases} \]

holds. Since the degree-t term of the Boolean representation of
$e_k^{(\tau(i))}(p(x))$ comes only from $x^{2^t - 1}$ and never comes from
$x^d$ $(d < 2^t - 1)$,

\[ \deg e_k^{(\tau(i))}(p(x)) = t \]

holds for $1 \le \forall i \le t$.

□

7 Conclusion

This paper presented linear sum attack, which is a generalized form of interpo- 
lation attack, and presented an algorithm that efficiently evaluates the security 
of a cipher against linear sum attack. We applied the algorithm to 128-bit key 
CRYPTON, E2, and RIJNDAEL, which have 12, 12, and 10 rounds, respec- 
tively, and showed that the ciphers reduced to 3 rounds have non-trivial linear 
sum relations. Moreover, we showed that resistance against linear sum attack 
implies resistance against higher order differential attack. 

There are 2 open problems remaining. 

1. How to find effective $GF(q)$, $p$, $c'$?

2. How to construct a rational version of linear sum attack like interpolation
attack?

Appendix: Linear Sum Attack of Reduced Round Variants
of CRYPTON, E2, and RIJNDAEL

We evaluate the security against linear sum attack for CRYPTON, E2, and
RIJNDAEL using the results shown in Table 1. Since these linear sum attacks
are not superior to the known attacks against the ciphers and the attack
procedures are almost the same as the interpolation attack described in
Sect. 2.2, we do not analyze and describe the details.

First, we consider CRYPTON and RIJNDAEL. Both ciphers are based on
the same structure as Square, and there are 3-round linear sum relations
with $N < q$. Applying the 3-round linear sum relation from the 2nd round to
the 4th round, and guessing the 1st, the 5th, and the 6th round subkeys related
to the linear sum relation exhaustively, we can attack the ciphers reduced to 6
rounds faster than exhaustive search. The attack is almost the same as the
Square attack [pp. 28-31].
Next, we consider E2. There exist 3-round linear sum relations with $N < q$ in
spite of the existence of IT- and FT-Functions. We can attack E2 with IT- and
FT-Functions reduced to 3 rounds faster than exhaustive search, by applying
the linear sum relation from the 1st round to the 3rd round, and guessing key
bits used in the FT-Function related to the linear sum relation. We can attack
E2 without IT- and FT-Functions reduced to 5 rounds faster than exhaustive
search, by applying the linear sum relation from the 2nd round to the 4th round,
and guessing the 1st and the 5th round subkey bits related to the linear sum
relation.

Abstract. An ElGamal-type cryptosystem based on non-maximal imaginary
quadratic orders with trapdoor decryption was proposed in earlier work.
The trapdoor information is the factorization of the non-fundamental
discriminant $\Delta_p = \Delta_1 p^2$. The NICE-cryptosystem (New Ideal Coset
Encryption) is an efficient variant thereof, which uses an element
$\mathfrak{g}^k \in \mathrm{Ker}(\phi_{Cl}^{-1}) \subseteq Cl(\Delta_p)$, where
$k$ is random and $\phi_{Cl}^{-1} : Cl(\Delta_p) \to Cl(\Delta_1)$ is a map
between the class groups of the non-maximal and maximal order, to mask the
message in the ElGamal cryptosystem. This mask simply “disappears” during
decryption, which essentially consists of computing $\phi_{Cl}^{-1}$. Thus NICE
features quadratic decryption time and hence is very well suited for
applications in which a central server has to decrypt a large number of
ciphertexts in a short time. In this work we will introduce an efficient batch
decryption method for NICE, which speeds up the decryption by about 30% for a
batch size of 100 messages.

A NICE-Schnorr-type signature scheme has also been proposed. In this
scheme one uses the group $\mathrm{Ker}(\phi_{Cl}^{-1})$ instead of
$\mathbb{F}_p^*$. Thus instead of modular arithmetic one would need to apply
standard ideal arithmetic (multiply and reduce), using the known algorithms
for example. Because every group operation needs the application of the
Extended Euclidean Algorithm, the implementation would be very inefficient.
Especially the signing process, which would typically be performed on a
smartcard with limited computational power, would be too slow to allow
practical application. In this work we will introduce an entirely new
arithmetic for elements in $\mathrm{Ker}(\phi_{Cl}^{-1})$, which uses the
generator and ring-equivalence for exponentiation. Thus the signer essentially
performs the exponentiation in
$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$, which turns out to be
about twenty times as fast as conventional ideal arithmetic. Furthermore, it
has been shown how one can further speed up this exponentiation by application
of the Chinese Remainder Theorem for
$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$. With this arithmetic the
signature generation is about forty times as fast as with conventional ideal
arithmetic and more than twice as fast as in the original Schnorr scheme.

1 Introduction

The utilization of imaginary quadratic class groups in cryptography is due to
Buchmann and Williams, who proposed a key agreement protocol analogous to
the Diffie-Hellman protocol, based on class groups of imaginary quadratic
fields, i.e. the class group of the maximal order. Since the computation of
discrete logarithms in the class group of the imaginary quadratic number field
is at least as difficult as factoring the corresponding discriminant, these
cryptosystems are very interesting from a theoretical point of view. In
practice, however, these cryptosystems seemed to be less efficient than popular
cryptosystems based on computing discrete logarithms in $\mathbb{F}_p^*$ or on
factoring integers. Furthermore, the computation of the group order, i.e. the
class number, is in general almost as hard as computing discrete logarithms
itself by application of the algorithm of Hafner / McCurley or more practical
variants, which is subexponential with $L[\frac{1}{2}]$. Hence it seemed to be
impossible to set up the analogous signature schemes. Later, however, it was
shown how the application of non-maximal imaginary quadratic orders may be used
to construct an ElGamal-type cryptosystem with fast decryption and that it is
in principle possible to set up ElGamal and RSA-type signature schemes.

Subsequently an ElGamal-type cryptosystem with very fast decryption was
proposed, later on called NICE for New Ideal Coset Encryption. It was shown
that the decryption process only needs quadratic time, which makes NICE unique
in this sense. First implementations show that the time for decryption is
comparable to the time for RSA encryption with $e = 2^{16}+1$. The central idea
of this scheme is to use an element of the kernel
$\mathrm{Ker}(\phi_{Cl}^{-1})$ of the surjective map
$\phi_{Cl}^{-1} : Cl(\Delta_p) \to Cl(\Delta_1)$ to mask the message in the
ElGamal-type cryptosystem. The map $\phi_{Cl}^{-1}$ is induced by the
isomorphism $\varphi^{-1} : \mathcal{I}_{\Delta_p}(p) \to \mathcal{I}_{\Delta_1}(p)$,
which maps $\mathcal{O}_{\Delta_p}$-ideals which are prime to the conductor p to
$\mathcal{O}_{\Delta_1}$-ideals which are also prime to p. Hence this mask
simply "disappears" during the trapdoor decryption, which just consists of
applying $\varphi^{-1}$ and reducing the resulting ideal in the maximal order
(and possibly going back to the non-maximal order using $\varphi$). The most
time consuming step in the decryption is to compute the map
$\phi_{Cl}^{-1}$, which is essentially the computation of a modular inverse
(modulo p) using the Extended Euclidean Algorithm, which needs
$O(\log^2(p))$ bit operations.

It is clear that because of this feature NICE is very well suited for
applications where a central server has to decrypt a large number of
ciphertexts in a short time. Thus it is natural to search for an efficient
batch decryption method. In Section 4 we will introduce a simple yet efficient
method for batch decryption, which speeds up the system in this scenario even
further. The timings in Section 4 show that it is possible to speed up the
decryption process for 100 messages by about 30%.

While the main application of the novel arithmetic for
$\mathrm{Ker}(\phi_{Cl}^{-1})$, to be introduced in Section 5, might be in the
signing procedure of the NICE-Schnorr-type signature scheme, its development
was actually motivated by cryptosystems based on totally non-maximal orders.
Due to a very recent result, however, which reduces the DL-problem in these
totally non-maximal orders to the DL-problem in finite fields, these
cryptosystems seem to have lost much of their attractiveness.

It was proposed to use totally non-maximal imaginary quadratic orders
$\mathcal{O}_{\Delta_{pq}}$, where $\Delta_{pq} = \Delta_1 p^2 q^2$, to set up
RSA-type cryptosystems. Because one chooses $\Delta_1$ such that
$h(\Delta_1) = 1$ it is easy to compute
$h(\Delta_{pq}) = (p - (\Delta_1/p))(q - (\Delta_1/q))$. It is clear that a
similar strategy may be used to set up DSA analogues based on totally
non-maximal imaginary quadratic orders. First implementations, however, have
shown that these cryptosystems using standard ideal arithmetic are far too
inefficient to be used in practice. This lack of efficiency was the motivation
for developing a more efficient arithmetic for $Cl(\Delta_p)$, or
$\mathrm{Ker}(\phi_{Cl}^{-1})$, which is the same in the case of totally
non-maximal orders.

In Section 5 we will introduce this entirely new method for efficient
exponentiation of elements in $\mathrm{Ker}(\phi_{Cl}^{-1})$. Instead of using
the standard ideal arithmetic (multiplication and reduction of ideals) in the
non-maximal order we multiply and "reduce" the corresponding generators in the
maximal order and later on lift the resulting principal ideal, which
corresponds to the computed generator, to the non-maximal order. Thus one
essentially reduces the arithmetic in
$\mathrm{Ker}(\phi_{Cl}^{-1}) \subseteq Cl(\Delta_p)$ to arithmetic in
$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$, which turns out to be much
more efficient.

The timings in Section 5.3 show that the naive variant of the new
exponentiation technique, as proposed here, is already about twenty times as
fast as classical ideal arithmetic. Very recently it was shown that one can
gain another factor of two by utilizing the Chinese Remainder Theorem for
$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$. With this improvement the
signature generation of the proposed NICE-Schnorr variant is more than twice
as efficient as in the original Schnorr scheme.

This paper is organized as follows: In Section 2 we will provide the necessary
basics of imaginary quadratic orders. We will concentrate on the relation
between the maximal and non-maximal orders and explain the structure of
$\mathrm{Ker}(\phi_{Cl}^{-1})$. In Section 3 we will briefly recall the NICE
cryptosystem. In Section 4 we will introduce the new batch decryption for NICE
and compare the running times of the implementation. The new exponentiation
methods for elements in $\mathrm{Ker}(\phi_{Cl}^{-1})$ are explained in
Section 5. We will give the initially proposed method in Section 5.1 and
outline the even more efficient CRT variant in Section 5.2. In Section 5.3 we
will also provide a timing comparison between the new methods, conventional
ideal arithmetic and modular arithmetic.

2 Imaginary Quadratic Orders 

The basic notions of imaginary quadratic number fields may be found in the
standard literature, which also provides a more comprehensive treatment of the
relationship between maximal and non-maximal orders.

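Let $\Delta \equiv 0, 1 \pmod 4$ be a negative integer which is not a square.
The quadratic order of discriminant $\Delta$ is defined to be

$$\mathcal{O}_\Delta = \mathbb{Z} + \omega\mathbb{Z},$$

where

$$\omega = \begin{cases} \sqrt{\Delta/4}, & \text{if } \Delta \equiv 0 \pmod 4, \\ (1+\sqrt{\Delta})/2, & \text{if } \Delta \equiv 1 \pmod 4. \end{cases} \qquad (1)$$

The standard representation of some $\alpha \in \mathcal{O}_\Delta$ is
$\alpha = x + y\omega$, where $x, y \in \mathbb{Z}$.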

If $\Delta_1$ is squarefree, then $\mathcal{O}_{\Delta_1}$ is the maximal order
of the quadratic number field $\mathbb{Q}(\sqrt{\Delta_1})$ and $\Delta_1$ is
called a fundamental discriminant. The non-maximal order of conductor $f > 1$
with (non-fundamental) discriminant $\Delta_f = \Delta_1 f^2$ is denoted by
$\mathcal{O}_{\Delta_f}$. In this work we will omit the subscripts to reference
arbitrary (fundamental or non-fundamental) discriminants. Because
$\mathbb{Q}(\sqrt{\Delta_1}) = \mathbb{Q}(\sqrt{\Delta_f})$ we also omit the
subscripts to reference the number field $\mathbb{Q}(\sqrt{\Delta})$. The
standard representation of an $\mathcal{O}_\Delta$-ideal is



$$\mathfrak{a} = q\left(a\mathbb{Z} + \frac{b+\sqrt{\Delta}}{2}\mathbb{Z}\right) = q(a, b), \qquad (2)$$

where $q \in \mathbb{Q}_{>0}$, $a \in \mathbb{Z}_{>0}$,
$c = (b^2 - \Delta)/(4a) \in \mathbb{Z}$, $\gcd(a, b, c) = 1$ and
$-a < b \le a$. The norm of this ideal is $N(\mathfrak{a}) = aq^2$. An ideal is
called primitive if $q = 1$. A primitive ideal is called reduced if
$|b| \le a \le c$ and $b \ge 0$ if $a = c$ or $|b| = a$. It can be shown that
the norm of a reduced ideal $\mathfrak{a}$ satisfies
$N(\mathfrak{a}) \le \sqrt{|\Delta|/3}$ and conversely that if
$N(\mathfrak{a}) < \sqrt{|\Delta|/4}$ then the ideal $\mathfrak{a}$ is reduced.
We denote the reduction operator in the maximal order by $\rho_1()$ and write
$\rho_f()$ for the reduction operator in the non-maximal order of conductor f.
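The reducedness condition can be checked mechanically. The following is a minimal Python sketch, assuming a primitive ideal given by its standard representation (a, b); the function name `is_reduced` and its argument order are chosen here for illustration and do not come from the original text.

```python
def is_reduced(a: int, b: int, delta: int) -> bool:
    """Check whether the primitive ideal (a, b) of discriminant delta < 0 is reduced."""
    # (a, b) is only a valid standard representation if c is an integer.
    if a <= 0 or (b * b - delta) % (4 * a) != 0:
        return False
    c = (b * b - delta) // (4 * a)
    # Main condition: |b| <= a <= c.
    if not (abs(b) <= a <= c):
        return False
    # On the boundary (a == c or |b| == a) b must be non-negative.
    if (a == c or abs(b) == a) and b < 0:
        return False
    return True
```

For example, with discriminant -23 the ideals (1, 1), (2, 1) and (2, -1) pass the test, while (3, 1) does not, because there a = 3 exceeds c = 2.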

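The group of invertible $\mathcal{O}_\Delta$-ideals is denoted by
$\mathcal{I}_\Delta$. Two ideals $\mathfrak{a}, \mathfrak{b}$ are equivalent if
there is a $\gamma \in \mathbb{Q}(\sqrt{\Delta})$ such that
$\mathfrak{a} = \gamma\mathfrak{b}$. This equivalence relation is denoted by
$\mathfrak{a} \sim \mathfrak{b}$. The set of principal
$\mathcal{O}_\Delta$-ideals, i.e. those which are equivalent to
$\mathcal{O}_\Delta$, is denoted by $\mathcal{P}_\Delta$. The factor group
$\mathcal{I}_\Delta/\mathcal{P}_\Delta$ is called the class group of
$\mathcal{O}_\Delta$, denoted by $Cl(\Delta)$. $Cl(\Delta)$ is a finite abelian
group with neutral element $\mathcal{O}_\Delta$. Algorithms for the group
operation (multiplication and reduction of ideals) can be found in the
literature. The order of the class group is called the class number of
$\mathcal{O}_\Delta$ and is denoted by $h(\Delta)$.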

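Our cryptosystems make use of the relation between the maximal and non-maximal
orders. Any non-maximal order may be represented as
$\mathcal{O}_{\Delta_f} = \mathbb{Z} + f\mathcal{O}_{\Delta_1}$. If
$h(\Delta_1) = 1$ then $\mathcal{O}_{\Delta_f}$ is called a totally non-maximal
imaginary quadratic order of conductor f. An $\mathcal{O}_\Delta$-ideal
$\mathfrak{a}$ is called prime to f if $\gcd(N(\mathfrak{a}), f) = 1$. It is
well known that all $\mathcal{O}_{\Delta_f}$-ideals prime to the conductor are
invertible. In every class there is an ideal which is prime to any given
number. The algorithm FindIdealPrimeTo from earlier work will compute such an
ideal. If we denote the (principal) $\mathcal{O}_{\Delta_f}$-ideals which are
prime to f by $\mathcal{P}_{\Delta_f}(f)$ and $\mathcal{I}_{\Delta_f}(f)$
respectively, then there is an isomorphism

$$\mathcal{I}_{\Delta_f}(f)/\mathcal{P}_{\Delta_f}(f) \simeq \mathcal{I}_{\Delta_f}/\mathcal{P}_{\Delta_f} = Cl(\Delta_f). \qquad (3)$$
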
Thus we may "neglect" the ideals which are not prime to the conductor if we are
only interested in the class group $Cl(\Delta_f)$. There is an isomorphism
between the group of $\mathcal{O}_{\Delta_f}$-ideals which are prime to f and
the group of $\mathcal{O}_{\Delta_1}$-ideals which are prime to f, denoted by
$\mathcal{I}_{\Delta_1}(f)$:

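Proposition 1. Let $\mathcal{O}_{\Delta_f}$ be an order of conductor f in an
imaginary quadratic field $\mathbb{Q}(\sqrt{\Delta})$ with maximal order
$\mathcal{O}_{\Delta_1}$.

(i.) If $\mathfrak{A} \in \mathcal{I}_{\Delta_1}(f)$, then
$\mathfrak{a} = \mathfrak{A} \cap \mathcal{O}_{\Delta_f} \in \mathcal{I}_{\Delta_f}(f)$
and $N(\mathfrak{A}) = N(\mathfrak{a})$.

(ii.) If $\mathfrak{a} \in \mathcal{I}_{\Delta_f}(f)$, then
$\mathfrak{A} = \mathfrak{a}\mathcal{O}_{\Delta_1} \in \mathcal{I}_{\Delta_1}(f)$
and $N(\mathfrak{a}) = N(\mathfrak{A})$.

(iii.) The map $\varphi : \mathfrak{A} \mapsto \mathfrak{A} \cap \mathcal{O}_{\Delta_f}$
induces an isomorphism $\mathcal{I}_{\Delta_1}(f) \to \mathcal{I}_{\Delta_f}(f)$.
The inverse of this map is
$\varphi^{-1} : \mathfrak{a} \mapsto \mathfrak{a}\mathcal{O}_{\Delta_1}$.

Proof: See [Proposition 7.20, page 144]. □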

Thus we are able to switch to and from the maximal order. The algorithms
GoToMaxOrder($\mathfrak{a}$, f) to compute $\varphi^{-1}$ and
GoToNonMaxOrder($\mathfrak{A}$, f) to compute $\varphi$ may be found in earlier
work.

It is important to note that the isomorphism $\varphi$ is between the ideal
groups $\mathcal{I}_{\Delta_1}(f)$ and $\mathcal{I}_{\Delta_f}(f)$ and not the
class groups. If, for $\mathfrak{A}, \mathfrak{B} \in \mathcal{I}_{\Delta_1}(f)$,
we have $\mathfrak{A} \sim \mathfrak{B}$, it is not necessarily true that
$\varphi(\mathfrak{A}) \sim \varphi(\mathfrak{B})$.

On the other hand, equivalence does hold under $\varphi^{-1}$. More precisely
we have the following:

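Proposition 2. The isomorphism $\varphi^{-1}$ induces a surjective homomorphism
$\phi_{Cl}^{-1} : Cl(\Delta_f) \to Cl(\Delta_1)$, where
$\mathfrak{a} \mapsto \rho_1(\varphi^{-1}(\mathfrak{a}))$.

Proof: This immediately follows from the short exact sequence:

$$Cl(\Delta_f) \longrightarrow Cl(\Delta_1) \longrightarrow 1$$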

(see Theorem 12.9, p. 82]). □ 

In the following we will study the kernel $\mathrm{Ker}(\phi_{Cl}^{-1})$ of the
above map $\phi_{Cl}^{-1}$, and hence the relation between a class in the
maximal order and the associated classes in the non-maximal order, in more
detail. We start with yet another interpretation of the class group
$Cl(\Delta_f)$.

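Proposition 3. Let $\mathcal{O}_{\Delta_f}$ be an order of conductor f in a
quadratic field. Then there are natural isomorphisms

$$Cl(\Delta_f) \simeq \mathcal{I}_{\Delta_f}(f)/\mathcal{P}_{\Delta_f}(f) \simeq \mathcal{I}_{\Delta_1}(f)/\mathcal{P}_{\Delta_1,\mathbb{Z}}(f),$$

where $\mathcal{P}_{\Delta_1,\mathbb{Z}}(f)$ denotes the subgroup of
$\mathcal{I}_{\Delta_1}(f)$ generated by the principal ideals of the form
$\alpha\mathcal{O}_{\Delta_1}$ where $\alpha \in \mathcal{O}_{\Delta_1}$
satisfies $\alpha \equiv a \pmod{f\mathcal{O}_{\Delta_1}}$ for some
$a \in \mathbb{Z}$ such that $\gcd(a, f) = 1$.

Proof: See [Proposition 7.22, page 145]. □

The following corollary is an immediate consequence.

Corollary 1. With notations as above we have the following isomorphism

$$\mathrm{Ker}(\phi_{Cl}^{-1}) \simeq \mathcal{P}_{\Delta_1}(f)/\mathcal{P}_{\Delta_1,\mathbb{Z}}(f).$$

The next result explains the relation between $\mathrm{Ker}(\phi_{Cl}^{-1})$
and $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$.

Lemma 1. The map
$(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^* \to \mathrm{Ker}(\phi_{Cl}^{-1})$,
where $\alpha \mapsto \varphi(\alpha\mathcal{O}_{\Delta_1})$, is a surjective
homomorphism.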

Proof: This is shown in the more comprehensive proof of Theorem 7.24 in Q 
(page 147). □ 

Another immediate consequence of Proposition 3 allows one to decide which
principal ideals in the maximal order are mapped to principal ideals in the
non-maximal order by applying $\varphi$:

Corollary 2. Let $\alpha \in \mathcal{O}_{\Delta_1}$ be an element of the
maximal order and $\mathcal{O}_{\Delta_f}$ be the order of conductor f. Then
$\varphi(\alpha\mathcal{O}_{\Delta_1}) \sim \mathcal{O}_{\Delta_f}$ if and only
if

$$\alpha \equiv a \pmod{f\mathcal{O}_{\Delta_1}}$$

with $a \in \mathbb{Z}$ such that $\gcd(a, f) = 1$.

Thus we are able to "model" the equivalence relation in the non-maximal order
by considering generators of principal ideals in the maximal order. This fact
is called ring-equivalence.

In Section 5 we will use the above results to formulate concrete algorithms
for efficient exponentiation of elements in $\mathrm{Ker}(\phi_{Cl}^{-1})$.

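Finally, we will give the exact relationship between the class numbers
$h(\Delta_1)$ and $h(\Delta_f)$.

Theorem 1. Let $\mathcal{O}_{\Delta_f}$ be the order of conductor f in a
quadratic field $\mathbb{Q}(\sqrt{\Delta})$ with maximal order
$\mathcal{O}_{\Delta_1}$. Then

$$h(\Delta_f) = \frac{h(\Delta_1)\, f}{[\mathcal{O}_{\Delta_1}^* : \mathcal{O}_{\Delta_f}^*]} \prod_{p \mid f} \left(1 - \left(\frac{\Delta_1}{p}\right)\frac{1}{p}\right) = n\,h(\Delta_1),$$

where $n \in \mathbb{N}$ and $\left(\frac{\Delta_1}{p}\right)$ is the
Kronecker symbol.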

Proof: See Q Theorem 7.24, page 146]. □ 

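Because $\mathcal{O}_{\Delta_1}^* = \mathcal{O}_{\Delta_p}^* = \{\pm 1\}$ for
$\Delta_p = \Delta_1 p^2$, p prime and $\Delta_1 < -4$, we have an immediate
corollary of Theorem 1.

Corollary 3. Let $\Delta_1 < -4$, $\Delta_1 \equiv 0, 1 \pmod 4$ and p prime.
Then $h(\Delta_p) = h(\Delta_1)\left(p - \left(\frac{\Delta_1}{p}\right)\right)$
and $|\mathrm{Ker}(\phi_{Cl}^{-1})| = p - \left(\frac{\Delta_1}{p}\right)$,
where $\left(\frac{\Delta_1}{p}\right)$ is the Kronecker symbol.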
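To make the kernel order concrete, here is a small Python sketch of the formula from Corollary 3. It assumes an odd prime conductor p, for which the Kronecker symbol coincides with the Legendre symbol and can be evaluated by Euler's criterion; the helper names `legendre` and `kernel_order` are introduced here only for illustration.

```python
def legendre(d: int, p: int) -> int:
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    r = pow(d % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


def kernel_order(delta1: int, p: int) -> int:
    """Order p - (delta1/p) of the kernel, for delta1 < -4 and an odd prime p."""
    return p - legendre(delta1, p)
```

With the discriminant -163 used later in the timings, the conductor p = 43 gives a kernel of order 42 (since -163 is a square modulo 43), while p = 7 gives order 8.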



Thus we are able to control the order of the kernel and consequently set up
a Schnorr analogue using the group $\mathrm{Ker}(\phi_{Cl}^{-1})$ instead of
$\mathbb{F}_p^*$, as proposed in earlier work.






3 The NICE Cryptosystem 



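In this section we will briefly recall the setup of NICE. We refer to the
original description for a more comprehensive treatment.

Choose two primes p, q with $p > 2\sqrt{q}$ and set $\Delta_1 = -q$ if
$q \equiv 3 \pmod 4$, $\Delta_1 = -4q$ otherwise, and $\Delta_p = \Delta_1 p^2$.
Then $\mathcal{O}_{\Delta_1}$ is a maximal order and $\mathcal{O}_{\Delta_p}$ is
a non-maximal order of conductor p. Note that all reduced
$\mathcal{O}_{\Delta_1}$-ideals are guaranteed to be prime to p, because
$p > \sqrt{|\Delta_1|}$ while the norm of a reduced ideal is at most
$\sqrt{|\Delta_1|/3}$. Furthermore choose a reduced
$\mathcal{O}_{\Delta_p}$-ideal $\mathfrak{g} \in \mathrm{Ker}(\phi_{Cl}^{-1})$.
A simple algorithm which computes such a kernel element $\mathfrak{g}$ has been
given in earlier work.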

The secret key is just

- the conductor p.

The public key consists of

- the non-fundamental discriminant $\Delta_p$ and
- the ideal $\mathfrak{g}$.

Because the system is entirely broken if one is able to factor $\Delta_p$, one
should choose p and q sufficiently large.

To encrypt a message $1 < m < \sqrt{|\Delta_1|/4}$ one proceeds as follows:

1. Choose a random $k \in \mathbb{Z}$ with $1 < k < 2^{80}$.

2. Compute the reduced $\mathcal{O}_{\Delta_p}$-ideal
   $\mathfrak{k} = \rho_p(\mathfrak{g}^k)$.

3. Embed the message $m \in \mathbb{Z}$ in an $\mathcal{O}_{\Delta_p}$-ideal
   $\mathfrak{m}$ with $N(\mathfrak{m}) < \sqrt{|\Delta_1|/4}$.

4. Compute the ciphertext $\mathfrak{c} = \rho_p(\mathfrak{m}\mathfrak{k})$.

For the message embedding one may use an algorithm from earlier work. It is
clear that the ideal $\mathfrak{k}$ is simply used to "mask" the message in the
ElGamal-type scheme. Furthermore note that $k < 2^{80}$ can be chosen to be
"unusually small", because in contrast to the classical ElGamal cryptosystem
the ciphertext consists of just one element and hence one would have to apply
a brute force strategy to determine the message. It is just not possible to
compute a discrete logarithm using more sophisticated techniques (e.g.
baby-step giant-step) if one is only given the ciphertext. We refer to earlier
work for a detailed treatment of this issue.

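To decrypt the ciphertext $\mathfrak{c}$ one proceeds as follows:

1. Compute $\mathfrak{L} = \varphi^{-1}(\mathfrak{c})$ using algorithm
   GoToMaxOrder($\mathfrak{c}$, p).

2. Reduce $\mathfrak{L}$, i.e. compute $\mathfrak{M} = \rho_1(\mathfrak{L})$.

3. Compute $\mathfrak{m} = \varphi(\mathfrak{M})$ using algorithm
   GoToNonMaxOrder($\mathfrak{M}$, p).

Note that the computation in Steps 1 and 2 is just the computation of
$\phi_{Cl}^{-1}$.

The correctness of the decryption procedure is easy to see. Because
$\mathfrak{g} \in \mathrm{Ker}(\phi_{Cl}^{-1})$ we have

$$\varphi^{-1}(\mathfrak{c}) \sim \varphi^{-1}(\mathfrak{m}\mathfrak{k}) = \varphi^{-1}(\mathfrak{m})(\alpha)\mathcal{O}_{\Delta_1} = \mathfrak{M}(\alpha)\mathcal{O}_{\Delta_1} \sim \mathfrak{M}, \quad \text{where } \alpha \in \mathcal{O}_{\Delta_1}.$$

Because $N(\mathfrak{m}) < \sqrt{|\Delta_1|/4}$ we know that
$\mathfrak{m} = \varphi(\mathfrak{M}) = \varphi(\rho_1(\mathfrak{L}))$ is a
reduced $\mathcal{O}_{\Delta_p}$-ideal, namely the message ideal $\mathfrak{m}$.

Note that if the message is embedded in the norm of the ideal $\mathfrak{m}$
only, as proposed in earlier work, then the step back to the non-maximal order
(Step 3) may be omitted, because we have $N(\mathfrak{m}) = N(\mathfrak{M})$.

For the reader's convenience we recall the algorithm GoToMaxOrder.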



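Algorithm 2 (GoToMaxOrder)

Input: A primitive $\mathcal{O}_{\Delta_p}$-ideal $\mathfrak{a} = (a, b)$, the
fundamental discriminant $\Delta_1$ and the conductor p

Output: A primitive $\mathcal{O}_{\Delta_1}$-ideal
$\mathfrak{A} = \varphi^{-1}(\mathfrak{a}) = \mathfrak{a}\mathcal{O}_{\Delta_1}$

1. $b_O \leftarrow \Delta_1 \pmod 2$

2. Solve $1 = \mu p + \lambda a$ for $\mu, \lambda \in \mathbb{Z}$

3. $B \leftarrow b\mu + a b_O \lambda \pmod{2a}$

4. RETURN $(a, B)$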
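The four steps of GoToMaxOrder translate almost directly into code. The sketch below is an illustrative Python rendering under the assumption that ideals are plain integer pairs (a, b); the helper `ext_gcd` and the function name `go_to_max_order` are hypothetical names introduced here, and B is returned in the range 0 <= B < 2a rather than normalized further.

```python
def ext_gcd(x: int, y: int):
    """Return (g, s, t) with g = gcd(x, y) = s*x + t*y."""
    s0, s1, t0, t1 = 1, 0, 0, 1
    while y:
        q, x, y = x // y, y, x % y
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return x, s0, t0


def go_to_max_order(a: int, b: int, delta1: int, p: int):
    """Map the primitive ideal (a, b) of discriminant delta1*p^2 to the maximal order."""
    b_o = delta1 % 2                      # Step 1
    g, mu, lam = ext_gcd(p, a)            # Step 2: 1 = mu*p + lam*a
    if g != 1:
        raise ValueError("ideal is not prime to the conductor")
    big_b = (b * mu + a * b_o * lam) % (2 * a)   # Step 3
    return a, big_b                       # Step 4
```

As a small check, for $\Delta_1 = -23$ and p = 7 the ideal (3, 1) of discriminant -1127 is mapped to (3, 1), and the resulting B indeed satisfies $B^2 \equiv \Delta_1 \pmod{4a}$.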

4 Efficient Batch Decryption for NICE 

It is clear that because of its very fast decryption NICE is very well suited
for applications in which a central server has to decrypt a large number of
ciphertexts in a short time. Thus it is desirable to have an efficient batch
decryption procedure at hand. In the following we will introduce a simple
method which decrypts n ciphertexts $\mathfrak{c}_i$, $1 \le i \le n$, in one
step, which turns out to be much faster than the sequential processing.

If we have a closer look at the decryption procedure above we recognize that
the most time consuming operation is the computation of GoToMaxOrder. This
step is essentially the computation of a modular inverse modulo the conductor.
Thus we can speed up the decryption process by applying a batch-gcd strategy,
as proposed in earlier work. The central idea is to replace all but one of the
costly inversions with the Extended Euclidean Algorithm by a few modular
multiplications.

If one is asked to compute $b_1 = a_1^{-1} \pmod p$ and
$b_2 = a_2^{-1} \pmod p$, then instead of performing two inversions one can
compute $a = a_1 a_2 \pmod p$, $b = a^{-1} \pmod p$, $b_1 = b a_2 \pmod p$ and
$b_2 = b a_1 \pmod p$. Thus one replaces one inversion by three modular
multiplications, which are usually faster, because in most implementations one
inversion costs about as much as 15 modular multiplications.
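The two-inverse trick can be written out in a few lines. This is a minimal Python sketch (the function name `two_inverses` is chosen here for illustration); it relies on the built-in three-argument `pow` with exponent -1, which is available from Python 3.8 on.

```python
def two_inverses(a1: int, a2: int, p: int):
    """Return (a1^-1, a2^-1) mod p using a single modular inversion."""
    a = (a1 * a2) % p      # multiplication 1
    b = pow(a, -1, p)      # the only inversion
    b1 = (b * a2) % p      # multiplication 2: (a1*a2)^-1 * a2 = a1^-1
    b2 = (b * a1) % p      # multiplication 3: (a1*a2)^-1 * a1 = a2^-1
    return b1, b2
```

For p = 101, a1 = 3 and a2 = 5 this yields the pair (34, 81), and indeed 3 * 34 = 102 and 5 * 81 = 405 are both congruent to 1 modulo 101.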

It is an easy matter to generalize this strategy to n inversions. This
immediately leads to the following algorithm for batch decryption, where we
assume that the message is entirely encoded in the norm of the message ideal,
as proposed in earlier work.

Algorithm 3 (NICE-Batch-Decryption)

Input: n ciphertexts, i.e. reduced $\mathcal{O}_{\Delta_p}$-ideals
$\mathfrak{c}_i = (a_i, b_i)$, $1 \le i \le n$, the fundamental discriminant
$\Delta_1$ and the conductor p.

Output: The n corresponding plaintexts, i.e. the norms $A_i$ of the
corresponding ideals $\mathfrak{M}_i = (A_i, B_i)$, for $1 \le i \le n$.

1. $b_O \leftarrow \Delta_1 \pmod 2$

2. $g_0 \leftarrow 1$

3. $g_1 \leftarrow a_1$

4. FOR i FROM 2 TO n DO $g_i \leftarrow g_{i-1} a_i \pmod p$

5. Compute $h_n \leftarrow g_n^{-1} \pmod p$

6. FOR i FROM n DOWNTO 1 DO

   6.1 $\lambda_i \leftarrow h_i g_{i-1} \pmod p$

   6.2 $h_{i-1} \leftarrow h_i a_i \pmod p$

   6.3 $\mu_i \leftarrow (1 - \lambda_i a_i)/p$

   6.4 $B_i \leftarrow b_i\mu_i + a_i b_O \lambda_i \pmod{2a_i}$

   6.5 $\mathfrak{M}_i = (A_i, B_i) \leftarrow \rho_1(a_i, B_i)$

7. RETURN n plaintexts $A_i$, $1 \le i \le n$

1 The author would like to thank V. Muller for pointing out the reference.
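Steps 1 to 6.4 of the batch algorithm can be sketched as follows. This Python fragment is an illustration only: it computes the unreduced maximal-order ideals $(a_i, B_i)$ with a single modular inversion and deliberately leaves out the reduction $\rho_1$ of Step 6.5, which the text does not spell out. The function name `batch_go_to_max_order` and the list-of-pairs input format are assumptions made here.

```python
def batch_go_to_max_order(ideals, delta1: int, p: int):
    """Steps 1-6.4 of the batch algorithm: one inversion for all ciphertext ideals."""
    n = len(ideals)
    b_o = delta1 % 2
    g = [1] * (n + 1)                   # g[0] = 1, g[i] = a_1 * ... * a_i mod p
    for i in range(1, n + 1):
        g[i] = (g[i - 1] * ideals[i - 1][0]) % p
    h = pow(g[n], -1, p)                # the single modular inversion (Python 3.8+)
    out = [None] * n
    for i in range(n, 0, -1):
        a_i, b_i = ideals[i - 1]
        lam = (h * g[i - 1]) % p        # lam = a_i^{-1} mod p
        h = (h * a_i) % p               # h now equals (a_1 ... a_{i-1})^{-1} mod p
        mu = (1 - lam * a_i) // p       # exact division: 1 = mu*p + lam*a_i
        out[i - 1] = (a_i, (b_i * mu + a_i * b_o * lam) % (2 * a_i))
    return out
```

Since B is determined modulo 2a by the ideal alone, the output agrees with applying GoToMaxOrder to each ciphertext separately, independently of which Bezout coefficients are used.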

Thus instead of n inversions with the Extended Euclidean Algorithm we only
have to perform one inversion, 3n - 3 modular multiplications, n integer
multiplications and n integer divisions. Thus in typical implementations we
are able to reduce the time for n decryptions, as shown in Table 1 below.
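As a rough back-of-the-envelope illustration of this operation count, one may compare only the inversions and modular multiplications, writing I for the cost of one inversion and M for one modular multiplication, and using the ratio of about 15 quoted above. This simplified model ignores the n integer multiplications and divisions as well as the ideal reduction, so it should be read as a sketch rather than a prediction of the measured timings.

```latex
\[
  C_{\mathrm{seq}}(n) \approx n \cdot I, \qquad
  C_{\mathrm{batch}}(n) \approx I + (3n-3)\,M,
\]
\[
  I \approx 15\,M,\ n = 100:\qquad
  C_{\mathrm{seq}} \approx 1500\,M, \qquad
  C_{\mathrm{batch}} \approx 15\,M + 297\,M = 312\,M .
\]
```

The inversion part alone therefore becomes several times cheaper; the overall decryption gains less because the remaining steps, in particular the reduction in the maximal order, are unaffected by batching.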

The implementation was done using the LiDIA package on a Pentium 133 MHz,
choosing random primes p, q of the respective bit length. The timings are
given in milliseconds (ms), averaged over 100 randomly chosen messages. The
first row shows how many modular multiplications are as costly as one
inversion in LiDIA. The next rows give the time for a NICE encryption using
80 bit exponents and the binary, the usual BGMW and the signed BGMW method for
exponentiation. This includes the time for the message embedding. The last
four rows give the decryption time (per message) for batch sizes of 1, 5, 10
and 100 messages respectively. The percentage columns relate each timing to
the first row of its group (binary encryption or single-message decryption).
This shows that for a batch size of 100 we are able to speed up the decryption
by about 30%.



bit length of p, q       200               300               400               500
mult / inv               13.9              15.4              16.2              15.6
                         ms        %       ms        %       ms        %       ms         %
NICE Enc. (binary)       1861.7    100     4065.2    100     7368.9    100     12182.1    100
NICE Enc. (BGMW)         669.7     35.97   1786.6    43.95   3556.5    48.26   6461.9     53.04
NICE Enc. (±-BGMW)       640.9     34.43   1732.6    42.62   3493.6    47.41   6315.5     51.84
NICE Dec. (1 mess.)      9.50      100     16.75     100     26.30     100     35.66      100
NICE Dec. (5 mess.)      8.20      86.32   13.16     78.57   20.00     76.05   26.93      75.52
NICE Dec. (10 mess.)     7.45      78.42   12.34     73.67   19.11     72.66   25.61      71.82
NICE Dec. (100 mess.)    6.70      70.53   11.64     69.49   18.30     69.58   24.61      69.01

Table 1. Timings for NICE with sequential and batch decryption



5 Efficient Exponentiation for Elements of $\mathrm{Ker}(\phi_{Cl}^{-1})$

In this section we will introduce a novel arithmetic for classes in
$\mathrm{Ker}(\phi_{Cl}^{-1})$ which turns out to be much more efficient than
standard ideal arithmetic.

Since we need to apply $\varphi$ during our computation we will only consider
ideals $\mathfrak{a}$ which are prime to the conductor f. Thus if we are
considering principal (integral) ideals $\alpha\mathcal{O}_{\Delta_1}$, for
some $\alpha \in \mathcal{O}_{\Delta_1}$, then we require
$\gcd(N(\alpha), f) = 1$. We start with providing the details of a naive
generator arithmetic in Section 5.1. While an exponentiation of an ideal using
this arithmetic turns out to be about twenty times (for the Schnorr scheme,
and thirteen times for the DSA scheme in totally non-maximal orders) as fast
as conventional ideal arithmetic, one can gain another factor of two by
applying CRT in $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$, as has
been proposed elsewhere. For the reader's convenience this method is briefly
outlined in Section 5.2. With this arithmetic the signature generation in the
Schnorr analogue is more than twice as fast as in the original scheme.

5.1 Arithmetic in $\mathrm{Ker}(\phi_{Cl}^{-1})$ Using $(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$

While we already know from Lemma 1 that the arithmetic in
$\mathrm{Ker}(\phi_{Cl}^{-1})$ can be reduced to the arithmetic in
$(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$, we give a very elementary
proof here, which ends up in a "ready to implement" algorithm.

It is clear that all integral ideals
$\mathfrak{a} \in \mathrm{Ker}(\phi_{Cl}^{-1}) \subseteq Cl(\Delta_f)$ are of
the form

$$\mathfrak{a} = \varphi(\alpha\mathcal{O}_{\Delta_1}), \qquad (4)$$

for some $\alpha \in \mathcal{O}_{\Delta_1}$.

Now instead of multiplying and reducing the ideals in the non-maximal order 
we will work with the generators which are corresponding to principal ideals in 
the maximal order. 

We will start with a simple lemma, which can easily be verified by straight- 
forward calculation. 

Lemma 2. Let $\alpha_i = x_i + y_i\omega \in \mathcal{O}_{\Delta_1}$,
$x_i, y_i \in \mathbb{Z}$, $i \in \{1, 2\}$ and $\omega$ as given in (1). Then
$\beta = x + y\omega = \alpha_1\alpha_2$ is given by

$$x = x_1 x_2 + y_1 y_2 \frac{\Delta_1}{4} \qquad (5)$$
$$y = x_1 y_2 + x_2 y_1 \qquad (6)$$

in the case that $\Delta_1 \equiv 0 \pmod 4$ and

$$x = x_1 x_2 + y_1 y_2 \frac{\Delta_1 - 1}{4} \qquad (7)$$
$$y = x_1 y_2 + x_2 y_1 + y_1 y_2 \qquad (8)$$

if $\Delta_1 \equiv 1 \pmod 4$.

Thus multiplying two generators $\alpha_i$ is more efficient than multiplying
the two ideals $\alpha_i\mathcal{O}_{\Delta_1}$, because no application of the
costly Extended Euclidean Algorithm is necessary.

It is clear, however, that we "somehow need to reduce" intermediate results
during exponentiation to obtain a polynomial time algorithm. The central idea
is to "model" reduction of ideals (in the non-maximal order) by manipulating
the generator. This task will turn out to be surprisingly simple.
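The product formulas of Lemma 2 amount to a handful of integer multiplications. A minimal Python sketch follows, with a generator $x + y\omega$ stored as the pair (x, y); the function name `mul_generators` is an assumption made here for illustration.

```python
def mul_generators(x1, y1, x2, y2, delta1):
    """Product (x1 + y1*w)(x2 + y2*w) in the maximal order, returned as (x, y)."""
    if delta1 % 4 == 0:
        d = delta1 // 4                 # here w^2 = delta1/4
        return x1 * x2 + y1 * y2 * d, x1 * y2 + x2 * y1
    d = (delta1 - 1) // 4               # here w^2 = w + (delta1 - 1)/4
    return x1 * x2 + y1 * y2 * d, x1 * y2 + x2 * y1 + y1 * y2
```

For $\Delta_1 = -4$ the order is the ring of Gaussian integers with $\omega = i$, and the sketch reproduces $(1 + 2i)(3 + 4i) = -5 + 10i$. For $\Delta_1 = -163$ it gives $\omega^2 = -41 + \omega$.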

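The following lemma is immediate.

Lemma 3. Let $\alpha = x + y\omega$, $\alpha' = x' + y'\omega \in \mathcal{O}_{\Delta_1}$
and $f \in \mathbb{Z}_{>1}$. Then $\alpha \equiv \alpha' \pmod{f\mathcal{O}_{\Delta_1}}$
if and only if $x' \equiv x \pmod f$ and $y' \equiv y \pmod f$.

Next we will consider the norm of an element $\alpha \in \mathcal{O}_{\Delta_1}$
under this congruence.

Lemma 4. Let $\alpha, \beta \in \mathcal{O}_{\Delta_1}$ and
$f \in \mathbb{Z}_{>1}$. If $\alpha \equiv \beta \pmod{f\mathcal{O}_{\Delta_1}}$
then $N(\alpha) \equiv N(\beta) \pmod f$.

Proof: Let $\alpha = x + y\omega$. Then by Lemma 3 above we have
$\beta = x' + y'\omega$, where $x' \equiv x \pmod f$ and $y' \equiv y \pmod f$.
In the case $\Delta_1 \equiv 0 \pmod 4$ we then have

$$N(\alpha) = x^2 - y^2\omega^2 \equiv x'^2 - y'^2\omega^2 = N(\beta) \pmod f.$$

The case $\Delta_1 \equiv 1 \pmod 4$, where
$N(\alpha) = x^2 + xy + y^2(1 - \Delta_1)/4$, is analogous. □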

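The following corollary is immediate.

Corollary 4. Let $\alpha, \beta \in \mathcal{O}_{\Delta_1}$,
$f \in \mathbb{Z}_{>1}$ and $\alpha \equiv \beta \pmod{f\mathcal{O}_{\Delta_1}}$.
Then $\gcd(N(\alpha), f) = 1$ if and only if $\gcd(N(\beta), f) = 1$.

Lemma 5. Let $\alpha, \beta \in \mathcal{O}_{\Delta_1}$ such that
$\gcd(N(\alpha), f) = \gcd(N(\beta), f) = 1$ and $\varphi$ as defined in
Proposition 1. Furthermore let $\gamma \equiv \alpha\beta \pmod{f\mathcal{O}_{\Delta_1}}$.
Then $\gcd(N(\gamma), f) = 1$ and if
$\alpha \equiv \beta \pmod{f\mathcal{O}_{\Delta_1}}$ then
$\varphi(\alpha\mathcal{O}_{\Delta_1}) \sim \varphi(\beta\mathcal{O}_{\Delta_1})$
in $Cl(\Delta_f)$.

Proof: That $\gcd(N(\gamma), f) = 1$ is immediate by the multiplicativity of
the norm and Corollary 4.

Because $\alpha \equiv \beta \pmod{f\mathcal{O}_{\Delta_1}}$ it follows that
$\alpha = \beta\delta$ for some $\delta \in \mathbb{Q}(\sqrt{\Delta})$, where
$\delta \equiv 1 \pmod{f\mathcal{O}_{\Delta_1}}$. Thus by Proposition 3 we know
that $\varphi(\delta\mathcal{O}_{\Delta_1}) \sim \mathcal{O}_{\Delta_f}$ and
hence the assertion follows. □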

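Furthermore we need the following result, which is immediate because $\varphi$
is an isomorphism.

Lemma 6. Let $\alpha \in \mathcal{O}_{\Delta_1}$ such that
$\gcd(N(\alpha), f) = 1$, $n \in \mathbb{Z}$ and $\varphi$ as defined in
Proposition 1. Then we have
$\varphi(\alpha\mathcal{O}_{\Delta_1})^n = \varphi(\alpha^n\mathcal{O}_{\Delta_1})$.

By combining the above results we immediately obtain the following.

Lemma 7. Let $\alpha \in \mathcal{O}_{\Delta_1}$ such that
$\gcd(N(\alpha), f) = 1$, $n \in \mathbb{Z}$ and $\varphi$ as defined in
Proposition 1. Then we have
$\varphi(\alpha\mathcal{O}_{\Delta_1})^n \sim \varphi(\gamma\mathcal{O}_{\Delta_1})$
for some $\gamma \equiv \alpha^n \pmod{f\mathcal{O}_{\Delta_1}}$.

The following lemma follows immediately from Lemma 2 and Lemma 3.

Lemma 8. Let $\alpha_i = x_i + y_i\omega \in \mathcal{O}_{\Delta_1}$,
$x_i, y_i \in \mathbb{Z}$, $i \in \{1, 2\}$, $\omega$ as given in (1) and
$f > 1$. Then $\beta = x + y\omega \equiv \alpha_1\alpha_2 \pmod{f\mathcal{O}_{\Delta_1}}$
is given by

$$x \equiv x_1 x_2 + y_1 y_2 \frac{\Delta_1}{4} \pmod f \qquad (9)$$
$$y \equiv x_1 y_2 + x_2 y_1 \pmod f \qquad (10)$$

in the case that $\Delta_1 \equiv 0 \pmod 4$ and

$$x \equiv x_1 x_2 + y_1 y_2 \frac{\Delta_1 - 1}{4} \pmod f \qquad (11)$$
$$y \equiv x_1 y_2 + x_2 y_1 + y_1 y_2 \pmod f \qquad (12)$$

if $\Delta_1 \equiv 1 \pmod 4$.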

This result enables us to "model" the conventional ideal arithmetic
(multiplication and reduction) by simple calculations modulo f. This leads to
the following algorithm for exponentiation, which is based on the binary
method for exponentiation. We denote the binary length of n by
$\lambda(n) = \lfloor \log_2(n) \rfloor + 1$.

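Algorithm 4 (Gen-Exp)

Input: $\alpha = x + y\omega \in \mathcal{O}_{\Delta_1}$, the conductor f such
that $\gcd(N(\alpha), f) = 1$ and the exponent $n \in \mathbb{Z}$.

Output: $\mathfrak{a} = (a, b) = \rho_f(\varphi((\alpha\mathcal{O}_{\Delta_1})^n))$.

1. IF n = 0 THEN OUTPUT$(1, \Delta_1 \pmod 2)$

2. IF n < 0 THEN $n \leftarrow -n$, $y \leftarrow -y$

3. $l \leftarrow \lambda(n) - 1$, $(n_l \ldots n_0)_2 \leftarrow$ binary
   expansion of n, i.e. $n_l = 1$

4. $x_h \leftarrow x \pmod f$

5. $y_h \leftarrow y \pmod f$

6. IF $\Delta_1 \equiv 0 \pmod 4$ THEN $D \leftarrow \Delta_1/4$
   ELSE $D \leftarrow (\Delta_1 - 1)/4$

7. FOR i = l - 1 DOWNTO 0 DO

   7.1 $h \leftarrow x_h$

   7.2 $x_h \leftarrow h^2 + y_h^2 D \pmod f$

   7.3 IF $\Delta_1 \equiv 0 \pmod 4$ THEN $y_h \leftarrow 2 h y_h \pmod f$
       ELSE $y_h \leftarrow 2 h y_h + y_h^2 \pmod f$

   7.4 IF $n_i = 1$ THEN

       7.4.1 $h \leftarrow x_h$

       7.4.2 $x_h \leftarrow h x + y_h y D \pmod f$

       7.4.3 IF $\Delta_1 \equiv 0 \pmod 4$ THEN $y_h \leftarrow h y + x y_h \pmod f$
             ELSE $y_h \leftarrow h y + x y_h + y_h y \pmod f$

8. /* Compute the standard representation
      $\mathfrak{A} = d(a, b) = \alpha_h\mathcal{O}_{\Delta_1}$ */

   8.1 /* Use the form $(x_h + y_h\sqrt{\Delta_1})/2$ */
       $x_h \leftarrow 2 x_h$
       IF $\Delta_1 \equiv 1 \pmod 4$ THEN $x_h \leftarrow x_h + y_h$

   8.2 Compute $d \leftarrow \gcd(y_h, (x_h + y_h\Delta_1)/2) = \lambda y_h + \mu (x_h + y_h\Delta_1)/2$,
       for $\lambda, \mu \in \mathbb{Z}$

   8.3 $A \leftarrow |x_h^2 - \Delta_1 y_h^2| / (4d^2)$

   8.4 $B \leftarrow (\lambda x_h + \mu (x_h + y_h)\Delta_1/2)/d \pmod{2A}$

9. /* Lift $\mathfrak{A}' = (1/d)\mathfrak{A}$ to the non-maximal order and
      reduce it */
   $b \leftarrow B f \pmod{2A}$
   $(a, b) \leftarrow \rho_f(A, b)$

10. OUTPUT$(a, b)$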

Proof: By Lemma 7 we only have to compute
$\gamma \equiv \alpha^n \pmod{f\mathcal{O}_{\Delta_1}}$. The correctness of the
exponentiation algorithm is immediate because it is the well known binary
method with the operation given in Lemma 8 as group operation.

In Step 8 we simply compute the standard representation of the ideal
$\mathfrak{A} = \alpha_h\mathcal{O}_{\Delta_1} = d\left(a\mathbb{Z} + \frac{b+\sqrt{\Delta_1}}{2}\mathbb{Z}\right)$.
By Corollary 4 we know that $N(\alpha_h\mathcal{O}_{\Delta_1}) = ad^2$ is prime
to f. This clearly implies that $\gcd(d, f) = 1$. Because
$\mathfrak{A} = (d)\mathfrak{A}'$ for $d \in \mathbb{Z}$ and
$\mathfrak{A}' = a\mathbb{Z} + \frac{b+\sqrt{\Delta_1}}{2}\mathbb{Z}$, we know
from Proposition 3 that $\varphi(\mathfrak{A}) \sim \varphi(\mathfrak{A}')$.
Finally it is clear that we can apply $\varphi$ from Proposition 1 because
$\gcd(a, f) = 1$. □
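The core of Gen-Exp, Steps 3 to 7, is an ordinary square-and-multiply loop over pairs modulo f. The following Python sketch covers only that part: it returns the residues $(x_h, y_h)$ and omits the conversion to an ideal, the lift and the reduction of Steps 8 to 10. It also assumes a positive exponent. The function name `gen_exp_mod` is introduced here for illustration.

```python
def gen_exp_mod(x: int, y: int, n: int, delta1: int, f: int):
    """Return (x_h, y_h) with x_h + y_h*w congruent to (x + y*w)^n mod f, for n >= 1."""
    if n < 1:
        raise ValueError("this sketch only handles n >= 1")
    even = delta1 % 4 == 0
    d = delta1 // 4 if even else (delta1 - 1) // 4      # Step 6
    x, y = x % f, y % f
    xh, yh = x, y                                       # Steps 4 and 5
    for bit in bin(n)[3:]:          # bits n_{l-1} ... n_0, leading 1 skipped
        h = xh
        xh = (h * h + yh * yh * d) % f                  # Step 7.2 (squaring)
        yh = (2 * h * yh + (0 if even else yh * yh)) % f
        if bit == "1":                                  # Step 7.4 (multiply by alpha)
            h = xh
            xh = (h * x + yh * y * d) % f
            yh = (h * y + x * yh + (0 if even else yh * y)) % f
    return xh, yh
```

As a sanity check in the Gaussian integers ($\Delta_1 = -4$), $(1 + 2i)^3 = -11 - 2i$, which the sketch returns modulo 1000 as the pair (989, 998).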



5.2 Even More Efficient Arithmetic in $\mathrm{Ker}(\phi_{Cl}^{-1})$ Using CRT in $(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$

In the previous section we saw that the arithmetic in
$\mathrm{Ker}(\phi_{Cl}^{-1})$ can be reduced to arithmetic in
$(\mathcal{O}_{\Delta_1}/f\mathcal{O}_{\Delta_1})^*$, which turns out to be much
more efficient. In this section we outline yet another method for a further
speed up. We refer to the original description for the details.

We will only concentrate on a special case which seems to be most important
for practical application, as it is used in the Schnorr analogue. That is, we
assume that the conductor is a prime p, where
$\left(\frac{\Delta_1}{p}\right) = 1$.

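Lemma 9. Let $\mathcal{O}_{\Delta_1}$ be the maximal order and p be prime. Then
there is an isomorphism

$$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \simeq \left(\mathbb{F}_p[X]/(f(X))\right)^*,$$

where $(f(X))$ is the ideal generated by $f(X) \in \mathbb{F}_p[X]$ and

$$f(X) = \begin{cases} X^2 - \frac{\Delta_1}{4}, & \text{if } \Delta_1 \equiv 0 \pmod 4, \\ X^2 - X + \frac{1 - \Delta_1}{4}, & \text{if } \Delta_1 \equiv 1 \pmod 4. \end{cases} \qquad (13)$$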



Proof: See Proposition 5]. □ 

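Theorem 5. Assume that $\left(\frac{\Delta_1}{p}\right) = 1$ and the roots
$\rho, \bar\rho \in \mathbb{F}_p$ of $f(X) \in \mathbb{F}_p[X]$ as given in (13)
are known. Then the following isomorphism can be computed in time
$O((\log p)^2)$:

$$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \simeq \mathbb{F}_p^* \otimes \mathbb{F}_p^*.$$

Proof: From Lemma 9 we know that there is an isomorphic map
$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \to \left(\mathbb{F}_p[X]/(f(X))\right)^*$,
where $f(X) \in \mathbb{F}_p[X]$ is given in (13), and that this isomorphism is
trivial to compute.

Because $\left(\frac{\Delta_1}{p}\right) = 1$ the polynomial $f(X)$ is not
irreducible, but can be decomposed as
$f(X) = (X - \rho)(X - \bar\rho) \in \mathbb{F}_p[X]$, where
$\rho, \bar\rho \in \mathbb{F}_p$ are the roots of $f(X)$. Thus if
$\Delta_1 \equiv 0 \pmod 4$ and $D = \Delta_1/4$ we have $\rho \in \mathbb{F}_p$
such that $\rho^2 \equiv D \pmod p$ and $\bar\rho = -\rho$. In the other case
$\Delta_1 \equiv 1 \pmod 4$ we have $\rho = (1 + b)/2$, where
$b^2 \equiv \Delta_1 \pmod p$, and $\bar\rho = (1 - b)/2 \in \mathbb{F}_p$. Thus
we have the isomorphisms

$$(\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \simeq \left(\mathbb{F}_p[X]/(X - \rho)\right)^* \otimes \left(\mathbb{F}_p[X]/(X - \bar\rho)\right)^* \simeq \mathbb{F}_p^* \otimes \mathbb{F}_p^*.$$

Let $\alpha = a + b\omega \in (\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^*$;
then the mapping
$\psi : (\mathcal{O}_{\Delta_1}/p\mathcal{O}_{\Delta_1})^* \to \mathbb{F}_p^* \otimes \mathbb{F}_p^*$
is given as $x_1 = \psi_1(\alpha) = a + b\rho \in \mathbb{F}_p^*$ and
$x_2 = \psi_2(\alpha) = a + b\bar\rho \in \mathbb{F}_p^*$. The inverse map
$\psi^{-1}$ is computed by solving the small system of linear equations, i.e.
one recovers $a, b \in \mathbb{F}_p$ by computing
$b = (x_2 - x_1)/(\bar\rho - \rho)$ and $a = x_1 - b\rho$. Thus both
transformations $\psi$ and $\psi^{-1}$ need time $O((\log p)^2)$. □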

With this result we immediately obtain the correctness of the following algorithm.



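Algorithm 6 (Gen-CRT)

Input: $\alpha = x + y\omega \in \mathcal{O}_{\Delta_1}$, the conductor p such
that $\gcd(N(\alpha), p) = 1$ and $\left(\frac{\Delta_1}{p}\right) = 1$, the
roots $\rho, \bar\rho \in \mathbb{F}_p^*$ of $f(X)$ as given in (13) and the
exponent $n \in \mathbb{Z}$.

Output: $\mathfrak{a} = (a, b) = \rho_p(\varphi((\alpha\mathcal{O}_{\Delta_1})^n))$.

1. IF n = 0 THEN OUTPUT$(1, \Delta_1 \pmod 2)$

2. IF n < 0 THEN $n \leftarrow -n$, $y \leftarrow -y$

3. $x_1 \leftarrow (x + \rho y)^n \pmod p$

4. $x_2 \leftarrow (x + \bar\rho y)^n \pmod p$

5. $r \leftarrow (\bar\rho - \rho)^{-1} \pmod p$

6. $y_h \leftarrow (x_2 - x_1) r \pmod p$

7. $x_h \leftarrow x_1 - y_h\rho \pmod p$

8. Compute standard representation, lift and reduce as in Algorithm 4,
   Steps 8 and 9

9. OUTPUT$(a, b)$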

Note that the computation of r in Step 5 can be done in a precomputation
phase, as it is independent of the current $\alpha$.
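Steps 3 to 7 of Gen-CRT can be sketched in Python as below. The fragment assumes that the two roots of $f(X)$ modulo p are already known and passed in as `rho` and `rho_bar`; it uses the built-in modular `pow` for the two exponentiations in $\mathbb{F}_p^*$ and, like the text, leaves the final conversion to a reduced ideal to Algorithm 4. The function name `gen_crt_mod` is an assumption made here, and the inverse r is recomputed on every call for simplicity although it could be precomputed.

```python
def gen_crt_mod(x: int, y: int, n: int, p: int, rho: int, rho_bar: int):
    """Return (x_h, y_h) with x_h + y_h*w congruent to (x + y*w)^n mod p, for n >= 0."""
    x1 = pow((x + rho * y) % p, n, p)        # Step 3: first F_p^* component
    x2 = pow((x + rho_bar * y) % p, n, p)    # Step 4: second F_p^* component
    r = pow((rho_bar - rho) % p, -1, p)      # Step 5: precomputable (Python 3.8+)
    yh = ((x2 - x1) * r) % p                 # Step 6
    xh = (x1 - yh * rho) % p                 # Step 7
    return xh, yh
```

For $\Delta_1 = -163$ and p = 43 the polynomial $X^2 - X + 41$ has the roots 2 and 42 modulo 43, and squaring $\omega$ gives the pair (2, 1), in agreement with $\omega^2 = \omega - 41 \equiv \omega + 2 \pmod{43}$.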



5.3 Timings for Different Arithmetics 

In this section we will give the timings of a first implementation of the
novel arithmetics for $\mathrm{Ker}(\phi_{Cl}^{-1})$. We will also include
timings for standard ideal arithmetic and modular arithmetic to allow
comparison.

For the RSA analogues in totally non-maximal orders we fixed $\Delta_1 = -163$
and chose a random exponent $k < n = pq$. For all DL-based systems (DSA and
Schnorr) we chose a random $k < 2^{160}$. For the DSA analogue based on totally
non-maximal orders we also fixed $\Delta_1 = -163$. Note that due to the recent
result mentioned above this analogue with $\Delta_p$ is only as secure as the
original scheme with p. Thus one needs to compare the lines for the 1200 bit
DSA analogue in $Cl(\Delta_p)$ with the time for 600 bit modular arithmetic.

For the NICE-Schnorr analogue we also chose a random $k < 2^{160}$ and
$\Delta_p = \Delta_1 p^2$ where $\Delta_1 = -q$ (or $\Delta_1 = -4q$ if
$q \equiv 1 \pmod 4$ respectively) and p, q with equal bitlength. Because
factoring integers is about as hard as the computation of discrete logarithms
(modulo p) one needs to compare the timings where $\Delta_p$ and the prime
modulus have the same bitlength.

The timings are given in microseconds on a Pentium 133 MHz using the LiDIA
package. One should note that the implementation of neither variant is
optimized. This is no problem, because we are interested in the comparison
rather than the absolute timings.



cryptosystem   Schnorr / DSA                                                    RSA
arithmetic     mod.    ideal    Gen-Exp          Gen-Exp        Gen-CRT        mod.     ideal     Gen-Exp
bitlength of   p       Δ_p      Δ_p = -163p^2    Δ_p = -qp^2    Δ_p = -qp^2    n = pq   n = pq    n = pq
600            188     3182     240              159            83             258      10490     994
800            302     4978     368              234            123            583      22381     2053
1000           447     7349     542              340            183            886      35231     3110
1200           644     9984     724              465            249            1771     68150     6087
1600           1063    15751    1156             748            409            3146     125330    10864
2000           1454    22868    1694             1018           563            5284     224799    18067

Table 2. Timings for exponentiation with different arithmetics



The timings in Table 2 show the impressive improvement. One can see that the
exponentiation using Algorithm 4 is already about thirteen times as fast as an
exponentiation using conventional ideal arithmetic if $\Delta_p = -163p^2$, and
more than twenty times as fast for the Schnorr analogue.

If we apply Algorithm 6, as outlined in Section 5.2, we are about forty times
as fast as conventional ideal arithmetic. Using this arithmetic the signature
generation in the NICE-Schnorr analogue is more than twice as fast as in the
original scheme in $\mathbb{F}_p^*$.

On the other hand we see that the RSA analogue in totally non-maximal orders
is still far less efficient than the original scheme and, although immune
against low exponent and chosen ciphertext attacks, not preferable in practice.

Finally one should note that for the signature verification in the NICE- 
Schnorr-scheme one has to use standard ideal arithmetic, which is very inef- 
ficient. Thus an important task for the future will be to speed up the standard- 
ideal arithmetic as well, to enable practical application of the proposed Schnorr- 
analogue 13 . 
Abstract. Lim and Lee present an algorithm for fast exponentiation
in a given group which is optimized for a limited amount of storage.
The algorithm uses one precomputation for several computations in order
to minimize the average time needed for one exponentiation. This
paper generalizes the previous work, proposing several improvements and
a method for fast precomputation. The basic Lim/Lee algorithm is improved
by determining the optimal segmentation of the exponent. Finally,
it is shown that the improved Lim/Lee algorithm is faster than
the previous one in the average case.



1 Introduction 

Modular exponentiation is a basic operation widely used in cryptography. In
many cryptographic protocols users must perform one or more exponentiations
in a given group. Well known examples are encryption, decryption and signatures
with RSA, signature generation and identification as in the Digital Signature
Standard (DSS), Brickell/McCurley, Schnorr, and many other
schemes. The exponentiation can be decomposed into a large number of
multiplications, so it is a computationally heavy operation that consumes a
lot of time and constitutes a bottleneck in many protocols. The
efficiency of most public-key cryptosystems mainly depends on the speed of the
exponentiation algorithm.

Classical algorithms for exponentiation are the binary algorithm (known as
the square-and-multiply method) and the signed binary algorithm. Other
algorithms use some amount of storage for intermediate values in order to
improve the performance. Examples are the windowing method and algorithms
based on addition chains. Lim and Lee present a new exponentiation

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 2000. 
algorithm based on precomputations. The goal of this algorithm is to achieve a
minimal number of operations (squarings and multiplications) for an
exponentiation under the condition of a limited amount of storage. The number
of operations required for the precomputation is not considered there.

The Lim/Lee algorithm optimizes the evaluation of the exponentiation $g^e$
in a given group (usually $\mathbb{Z}_N^*$, N being a large prime or a product of two large
primes) in the case when the base g is fixed and the exponent e is randomly chosen.
The fixed base g allows the use of a precomputation table in order to reduce
the number of computations required, but the algorithm has an additional
storage cost for the precomputed values. Such an algorithm, which is independent of
e but depends on g, is suitable for use in most discrete logarithm based protocols
for signature generation and identification (e.g. 

In this paper we present a generalization and several improvements of the
exponentiation algorithm of Lim/Lee. We first focus on the speed
of the algorithm without regard to the storage costs for the precomputed
elements, and then compare its behaviour for limited storage. Furthermore, we
present an efficient algorithm for precomputation and thereby optimize the
total number of operations for a given exponentiation, i.e. the number of
operations for the precomputation and for several computations based on it. Finally,
both algorithms are compared for variable length of the exponent.

The Lim/Lee algorithm is described in Section 2 and our improvements are
presented in Section 3. A new precomputation algorithm is proposed in Section 4.
Some comparisons of the variants of the Lim/Lee algorithm with the windowing
method, for the case when unlimited storage is available, are given in Section 5,
and the behaviour of the algorithm under the condition of limited storage is
presented in Section 6. Finally, Section 7 concludes the paper.

2 The Lim/Lee Algorithm 

In this section the Lim/Lee exponentiation algorithm is briefly presented with 
a slightly changed terminology. In the next section we present and discuss the 
improved and extended algorithm. 

In order to compute the exponentiation $g^e$ with the Lim/Lee algorithm, the
$l$-bit exponent $e$ is divided into $h$ blocks $e_i$, each of length $a = \lceil l/h \rceil$. The
exponent $e$ can be written as

$$e = \sum_{i=0}^{h-1} e_i \, 2^{ia}. \tag{1}$$

Each of the blocks is further subdivided into $v$ smaller blocks of size $b = \lceil a/v \rceil$,
and each block can be represented as

$$e_i = \sum_{j=0}^{v-1} e_{i,j} \, 2^{jb}. \tag{2}$$
Each block $e_{i,j}$ consists of $b$ bits and can be represented as

$$e_{i,j} = e_{i,j,b-1} \, e_{i,j,b-2} \cdots e_{i,j,0} = \sum_{k=0}^{b-1} e_{i,j,k} \, 2^{k}. \tag{3}$$

It is further assumed that the numbers of blocks $h$ and $v$ are chosen in a way that
$e_{h-1}$ and $e_{i,v-1}$, $0 \le i \le h-1$, are not equal to zero. The segmentation of the
exponent $e$ is shown in Figure 1. Based on the length $l$ of the exponent and on the
parameters $h$ and $v$, the precomputation leads to the array $G[j][u]$ with $0 \le j < v$
and $1 \le u < 2^h$. Employing the binary representation $u_{h-1} u_{h-2} \cdots u_1 u_0$ of $u$
and $r_i = g^{2^{ia}}$, the array is defined by the following equations:

$$G[0][u] = r_{h-1}^{u_{h-1}} \, r_{h-2}^{u_{h-2}} \cdots r_1^{u_1} \, r_0^{u_0} \tag{4}$$

$$G[j][u] = \left(G[j-1][u]\right)^{2^{b}} = G[0][u]^{2^{jb}} \quad \forall \; 1 \le j < v \tag{5}$$

Using the definition

$$I_{j,k} = \sum_{i=0}^{h-1} e_{i,j,k} \, 2^{i} \tag{6}$$

the exponentiation can be described with the following algorithm:

1. SET R = 1.

2. FOR k = b-1 DOWNTO 0

(a) SET R = R^2.

(b) FOR j = v-1 DOWNTO 0

i. SET R = R · G[j][I_{j,k}].

3. RETURN R.
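
The three steps above, together with the table of equations (4) and (5), can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function names `ceil_div`, `precompute` and `limlee_pow` are hypothetical, the group is taken to be the integers modulo N, and the table is built naively with `pow` rather than with an optimized precomputation. Bit positions at or beyond a inside a row are treated as zero padding, which is needed when v·b exceeds a.

```python
def ceil_div(x, y):
    """Integer ceiling of x / y."""
    return -(-x // y)


def precompute(g, N, l, h, v):
    """Build the table G[j][u] of equations (4) and (5), reduced modulo N."""
    a = ceil_div(l, h)                 # bits per row
    b = ceil_div(a, v)                 # bits per small block
    r = [pow(g, 1 << (i * a), N) for i in range(h)]   # r_i = g^(2^(i*a))
    G = [[1] * (1 << h) for _ in range(v)]            # G[j][0] = 1 (trivial factor)
    for u in range(1, 1 << h):
        low = u & -u                                  # lowest set bit of u
        G[0][u] = G[0][u ^ low] * r[low.bit_length() - 1] % N
    for j in range(1, v):
        for u in range(1, 1 << h):
            G[j][u] = pow(G[j - 1][u], 1 << b, N)     # raise to the power 2^b
    return G, a, b


def limlee_pow(e, G, a, b, h, v, N):
    """Evaluate g^e mod N with the table G, following steps 1 to 3."""
    R = 1
    for k in range(b - 1, -1, -1):
        R = R * R % N
        for j in range(v - 1, -1, -1):
            pos = j * b + k            # bit position inside a row
            index = 0                  # I_{j,k} of equation (6)
            if pos < a:                # positions >= a are padding, not exponent bits
                for i in range(h):
                    index |= ((e >> (i * a + pos)) & 1) << i
            R = R * G[j][index] % N
    return R
```

A table built once for a fixed base can be reused for any exponent of at most l bits; the result can be compared against Python's built-in `pow(g, e, N)`.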



We denote this algorithm as the 1. Lim/Lee algorithm. It needs $b - 1$
squarings¹ and $\frac{2^h - 1}{2^h} a - 1$ multiplications on average, but $a - 1$
multiplications in the worst case. Thus, the average number of operations needed
to perform a single exponentiation with this algorithm is

$$C_{Lim/Lee,1} = \alpha (b - 1) + \frac{2^h - 1}{2^h} \, a - 1, \tag{7}$$

where $\alpha$ denotes the ratio of the cost of a squaring to the cost of a multiplication.

Lim and Lee represent the exponent either as in Figure 1, with several rows each of
$v$ blocks, or as shown in Figure 2, with a shortened last row. With the partitioning
of Figure 2 (we denote it as the 2. Lim/Lee algorithm), the lengths of the blocks are

$$b_2 = \left\lceil \frac{l}{(h-1)v + v_{last}} \right\rceil \tag{8}$$

$$b_1 = \left\lceil \frac{l - b_2 \, h \, v_{last}}{(h-1)(v - v_{last})} \right\rceil \tag{9}$$
Fig. 1. Partitioning of an l-bit exponent e in h rows with the same length according
to the 1. Lim/Lee algorithm

Fig. 2. Partitioning of an l-bit exponent e with shortened last row according to
the 2. Lim/Lee algorithm



The computational cost in this case consists of $b_2 - 1$ squarings,
$b_1 (v - v_{last}) + b_2 v_{last} - 1$ multiplications at most, and
$\frac{2^{h-1}-1}{2^{h-1}} b_1 (v - v_{last}) + \frac{2^h - 1}{2^h} b_2 v_{last} - 1$
multiplications on average. Thus, the average number of operations
needed for one exponentiation is

$$C_{Lim/Lee,2} = \alpha (b_2 - 1) + \frac{2^{h-1}-1}{2^{h-1}} \, b_1 (v - v_{last}) + \frac{2^h - 1}{2^h} \, b_2 v_{last} - 1. \tag{10}$$

¹ Many squaring algorithms are faster than an ordinary multiplication, by making
use of the fact that both multiplicands are equal. On the other hand, multiplication
is performed with a constant multiplicand and, after performing some
precomputations, may also be faster than a squaring. Here $\alpha$ denotes the ratio of the
computational complexity of the algorithms for squaring and multiplying.

3 The Improved Algorithm 

When using the algorithms described in the last section, the following problems 
arise: 

— Two algorithms are presented by Lim and Lee without a statement which 
algorithm should be preferred in a given case. 

— It is not examined how to find the optimal choice of the parameters h and v,
or h, v and v_last respectively; exhaustive search with three parameters is very
expensive.

In order to find appropriate solutions for these problems, we examine the
algorithms described in the previous section in some detail. If we allow the
partitioning of the exponent with $v_{last} = 0$, i.e. in a rectangular form, as a special case of
the second algorithm, the result in the case of the same partitioning ($h_1 = h_2 - 1$ and
$v_1 = v_2$) is $b_1 = b = b_2$ and thus $C_{Lim/Lee,1} = C_{Lim/Lee,2}$. It is obvious that in
the case of $v_{last} = 0$ the second algorithm cannot be better than the first one.

It is also shown in Section 5 that the second algorithm is not worse than the
first one at any point. This can also be seen from the results of the numerical
tests for all lengths $l$ of the exponent in the range up to 512 bit.

Since the second algorithm has been identified as the better one, we further
try to decrease the number of basic parameters in this algorithm. The existence
of three basic parameters ($h$, $v$ and $v_{last}$) makes the exhaustive search expensive
and slow. A fundamental advantage can be achieved if $a$ and $b$ are
used as basic parameters instead of $h$ and $v$. We derive the parameters $h = \lceil l/a \rceil$
and $v = \lceil a/b \rceil$ from the basic parameters $a$ and $b$; the partitioning of the
exponent that results from this choice of the basic parameters is given
in Figure 3. We denote partitioning the exponent in this way as the 3. Lim/Lee
algorithm. The number of bits in the last row is now only $a_{last} = l - a(h-1)$,
and they are divided into $v_{last} = \lceil a_{last}/b \rceil$ blocks. The number of bits in the last
block is $b_{last} = a_{last} - b(v_{last} - 1)$. With this new way of partitioning the
exponent we achieve a computational cost of

$$C_{Lim/Lee,3} = \alpha (b - 1) + \frac{2^h - 1}{2^h} \, a - 1 \tag{11}$$

as the average number of operations for one exponentiation. Although this
formula is the same as (7), the parameters $a$, $b$ and $h$ can have values different
from those in (7) due to the different segmentation. It is shown in Section 5 that
the computational cost for this algorithm is never higher than the computational
cost of either of the two basic algorithms.
Fig. 3. Partitioning of an l-bit exponent e according to the extended, 3. Lim/Lee
algorithm



A further improvement results from the observation of the last row in Figure 3.
The term $\frac{2^h - 1}{2^h} a - 1$ in equation (11) results from the fact that the multiplication
in line 2(b)i of the algorithm description in Section 2 is trivial with probability
$2^{-h}$. Taking into account that the last row need not be filled completely, we
can use the term $\frac{2^{h-1}-1}{2^{h-1}} (a - a_{last}) + \frac{2^h - 1}{2^h} a_{last} - 1$ instead (we denote this as the 4.
Lim/Lee algorithm). This improvement does not concern the algorithm itself,
but only the specification of its average number of operations, and leads to

$$C_{Lim/Lee,4} = \alpha (b - 1) + \frac{2^{h-1}-1}{2^{h-1}} \, (a - a_{last}) + \frac{2^h - 1}{2^h} \, a_{last} - 1 \tag{12}$$

as the average number of operations for one exponentiation.
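
The derivation of the segmentation from the basic parameters a and b, and the two cost formulas for the 3. and 4. Lim/Lee algorithms, can be written down as a short Python sketch. The function names are hypothetical and exact fractions are used so that the averages are not rounded; the values l = 15, a = 8, b = 7 are the ones used in the worked example of the paper's Table 1.

```python
from fractions import Fraction


def ceil_div(x, y):
    """Integer ceiling of x / y."""
    return -(-x // y)


def segmentation(l, a, b):
    """Derive (h, v, a_last, v_last, b_last) from the basic parameters a and b."""
    h = ceil_div(l, a)
    v = ceil_div(a, b)
    a_last = l - a * (h - 1)            # bits in the last row
    v_last = ceil_div(a_last, b)        # blocks in the last row
    b_last = a_last - b * (v_last - 1)  # bits in the last block
    return h, v, a_last, v_last, b_last


def cost_3(l, a, b, alpha=1):
    """Average number of operations of the 3. Lim/Lee algorithm."""
    h = ceil_div(l, a)
    return alpha * (b - 1) + Fraction(2**h - 1, 2**h) * a - 1


def cost_4(l, a, b, alpha=1):
    """Average number of operations of the 4. Lim/Lee algorithm."""
    h, _, a_last, _, _ = segmentation(l, a, b)
    upper = Fraction(2**(h - 1) - 1, 2**(h - 1)) * (a - a_last)
    lower = Fraction(2**h - 1, 2**h) * a_last
    return alpha * (b - 1) + upper + lower - 1
```

For l = 15, a = 8, b = 7 this gives h = 2, v = 2, a_last = 7 and v_last = 1, with costs 11 and 10.75 respectively.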

4 Precomputation 

In all variations of the Lim/Lee algorithm analyzed above, only the number of
operations for the computation is considered, assuming that the precomputation
has already been performed. The computational cost for the precomputation of
the array $G[j][u]$ has not been considered at all, and an algorithm for the
precomputation of the array $G[j][u]$ is not given by Lim and Lee. Proposals for the precomputation
algorithm can be found in other works (one example appears on p. 626 of the cited reference), but they are not efficient.
Since the precomputation makes up a considerable fraction of the total number of
operations, we present an efficient algorithm for the precomputation consisting
of two steps:
1. Since the array elements $G[j][u]$ for $u = 2^i$, $0 \le i < h$ and $0 \le j < v$, have the
form $g^{2^{ia+jb}}$, the appropriate values can be computed with repeated squarings
beginning with $g$. Whenever the exponent matches an exponent in (4)
and/or (5), the corresponding value is assigned to the array element. The
number of squarings needed for this step is

$$a(h-1) + b(v_{last} - 1). \tag{13}$$

2. The remaining elements can be computed directly from the results of the first step with
one multiplication each. Of the $2^{h-1} - 1$ rows in the upper half,
$h - 1$ rows have already been computed in the first step, whereas only one
of the $2^{h-1}$ rows in the lower half is already finished. The number of
multiplications needed for this step is

$$v\,(2^{h-1} - h) + v_{last}\,(2^{h-1} - 1). \tag{14}$$

Thus, the total number of operations needed for the precomputation is

$$P = \alpha \left( a(h-1) + b(v_{last} - 1) \right) + v\,(2^{h-1} - h) + v_{last}\,(2^{h-1} - 1). \tag{15}$$
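
As a hedged illustration of the two steps, the sketch below marks every table entry as obtained by squaring (S) or by multiplication (M) and evaluates the cost formulas (13) to (15). The names `precomputation_marks` and `precomputation_cost` are hypothetical. The marking rule assumed here is the one described in the text: entries whose index u is a power of two come from the squaring chain, all others need one multiplication, and rows in the lower half (u at least 2^(h-1)) only have v_last columns.

```python
def precomputation_marks(h, v, v_last):
    """Map (j, u) -> 'S' or 'M' for every table entry that has to be computed."""
    marks = {}
    for u in range(1, 1 << h):
        columns = v if u < (1 << (h - 1)) else v_last   # lower half is shortened
        kind = "S" if u & (u - 1) == 0 else "M"         # powers of two: squaring chain
        for j in range(columns):
            marks[(j, u)] = kind
    return marks


def precomputation_cost(a, b, h, v, v_last, alpha=1):
    """Return (squarings, multiplications, P) following (13), (14) and (15)."""
    squarings = a * (h - 1) + b * (v_last - 1)
    multiplications = v * (2**(h - 1) - h) + v_last * (2**(h - 1) - 1)
    return squarings, multiplications, alpha * squarings + multiplications
```

Counting the entries marked M for h = 3, v = 3, v_last = 2 gives 9, in agreement with formula (14).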

An example of the precomputation with a partitioning of the exponent with h = 3,
v = 3, v_last = 2 is given in Figure 4. The values computed with squarings in the
first step are marked with S, and the multiplications in the second step are
marked with M.

          j = 2   j = 1   j = 0
u = 1       S       S       S
u = 2       S       S       S
u = 3       M       M       M
u = 4               S       S
u = 5               M       M
u = 6               M       M
u = 7               M       M

Fig. 4. Precomputation of the array G[j][u] in the Lim/Lee algorithm (h = 3,
v = 3 and v_last = 2)

M: Multiplication
S: Squaring

The results of the precomputations and computations with
various parameters and partitioning for one exponent with length l = 15 are
shown in Table 1. Using exhaustive search, the parameters shown in the table
have been found to be optimal for this case. The computational cost is



$$O_{Lim/Lee,x} = P + z \cdot C_{Lim/Lee,x} \tag{16}$$

where z denotes the number of computations based on a single precomputation,
and x = 1, 2, 3, 4.
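
The effect of z on the total cost can be made concrete with a small sketch. The helper names are hypothetical; the numbers plugged in are the values P and C reported in the paper's example for l = 15 (P = 0, C = 20.5 for the first algorithm and P = 9, C = 10.75 for the fourth).

```python
from fractions import Fraction


def total_operations(P, C, z=1):
    """Total cost O = P + z * C of one precomputation and z computations."""
    return P + z * C


def average_operations(P, C, z):
    """Average cost per exponentiation when the precomputation is shared."""
    return Fraction(total_operations(P, C, z), z)


# Values taken from the l = 15 example: (P, C) for the 1. and 4. algorithm.
ALG_1 = (0, Fraction(41, 2))
ALG_4 = (9, Fraction(43, 4))
```

With z = 1 the totals are 20.5 and 19.75; with z = 10 the average per exponentiation stays at 20.5 for the first algorithm but drops to 11.65 for the fourth, since the precomputation is amortized.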



Table 1. Comparison of the Lim/Lee algorithms using the example l = 15, α = 1
and z = 1

C: Number of operations needed for the computation
P: Number of operations needed for the precomputation
O: Total number of operations (precomputation and computation)

              Algorithm 1   Algorithm 2   Algorithm 3   Algorithm 4
h             1             2             2             2
v             1             2             2             2
v_last                      1             1             1
a             15                          8             8
b_1                         5
b resp. b_2   15            5             7             7
C             20.5          9.25          11            10.75
P             0             11            9             9
O             20.5          20.25         20            19.75



5 Optimization of Precomputation and Computation for 
(Nearly) Unlimited Memory 

The variations of the Lim/Lee algorithm described in Sections 2 and 3 are
analyzed and compared in this section for exponents of up to 512 bit length. A
comparison with the windowing exponentiation algorithm is also given. The
efficiency of the algorithms is measured by the average number of operations,
where the multiplications and the squarings are treated equally (α = 1).

The optimal parameters for each algorithm have been determined with
exhaustive search. The results for the range l ≤ 192 are shown in Figure 5. The
improvements compared to the first Lim/Lee algorithm are shown in percent.

An exact comparison of the required number of operations in one
exponentiation (precomputation and computation) shows that for all exponent lengths
l and a single exponentiation (z = 1) the following inequality

$$O_{Lim/Lee,1} \ge O_{Lim/Lee,2} \ge O_{Lim/Lee,3} \ge O_{Lim/Lee,4} \tag{17}$$

is satisfied. This means that the fourth Lim/Lee algorithm has the lowest
computational cost, and we set

$$O_{Lim/Lee} = O_{Lim/Lee,4}. \tag{18}$$

Fig. 5. Comparison of the variants of the Lim/Lee algorithm

In the range l > 192, which is not presented, the algorithms 1-3 behave in
the same way and only the fourth algorithm has small improvements.

We now compare the windowing exponentiation algorithm with the Lim/Lee
algorithm. Figure 6 shows the average number of operations needed for one
exponentiation, where the total consists of one
precomputation and z computations.

First, for a given length l of the exponent and a given number of exponentiations
z, the optimal values of the parameters (k for the windowing algorithm, a
and b for the Lim/Lee algorithm) are determined with exhaustive search. Then
the total number of operations is determined as the sum of the number of
operations for the precomputation and the number of operations for z computations.
Dividing the total number of operations by z gives the average number of
operations, which makes the results comparable.

If only a single exponentiation is performed, the windowing algorithm leads
to the lowest number of operations. When several exponentiations
based on the same precomputation need to be performed, and the exponents have
the same length l, the Lim/Lee algorithm has the best performance.

6 Optimization of the Computation for Limited Memory 

The results presented in the last section concern the case when the amount of
storage needed for the intermediate results of the precomputation is practically
unlimited. This condition is satisfied when the exponentiation is performed on a
PC or workstation. But the case when only limited storage and processing power
are available must also be considered, since many of the cryptographic protocols
using exponentiation (signature generation and verification schemes) are
performed on smart cards. The precomputation table in the Lim/Lee algorithm can
reduce the number of multiplications required at the expense of storage for the
precomputed values.

Fig. 6. Comparison of the exponentiation algorithms

O: Average number of operations needed for a single exponentiation

The numerical results for the time-memory tradeoffs for exponent lengths
of 160 and 512 bit are summarized in Tables 2 and 3 for various variants
of the Lim/Lee algorithm and optimal partitioning of the exponent in each
case. The same examples as in the cited reference are considered, and it is assumed that
squarings and multiplications have the same computational cost (α = 1).
The improved Lim/Lee algorithm is usually faster and never slower than
the others in the average case. We can see that if more storage is available, the
algorithm becomes faster because fewer multiplications are required.
The price of limited storage is a slower algorithm due to the increasing
number of operations required. An exponentiation with a 160 bit exponent can
be performed with only 19.96 multiplications on average if 2299 intermediate
values from the precomputation are stored. If only 10 values can be stored, the
same exponentiation requires 82 multiplications. So, given the storage capacity
of a smart card, we can estimate how fast the given exponentiation can be and
vice versa.
Table 2. Exponentiations with 160 bit (AC: Average case, WC: Worst case)

The rows that survive correspond to an exponent length of l = 512 (for example,
h/v = 8/4 gives a = 64 and b = 16).

Storage   Algorithm 1            Algorithm 2               Algorithm 3            Algorithm 4
          h/v     AC      WC     h/v/v_last  AC      WC    a/b      AC      WC    a/b      AC      WC
                                                     156   103/52   149.8   153   103/52   149.7   153
93        5/3     132.8   136    6/3/0       134.7   138   103/35   132.8   136   103/35   132.7   136
125       5/4     123.8   127    6/3/1       123.5   126   103/26   123.8   127   96/32    123.5   126
157       5/5     118.8   122    6/3/2       117.2   119   90/31    117.6   119   90/31    117.2   119
189       6/3     111.7   113    7/3/0       112.6   114   86/29    111.7   113   86/29    111.6   113
252       6/4     104.7   106    6/5/3       106.1   108   86/22    104.7   106   86/22    104.6   106
317       6/5     100.7   102    7/3/2       100.2   101   82/21    100.4   101   82/21    99.88   101
381       7/3     96.42   97     7/4/2       97.06   98    79/20    96.38   97    79/20    96.06   97
508       7/4     90.42   91     7/5/3       91.16   92    74/19    90.42   91    74/19    90.38   91
635       7/5     86.42   87     8/5/0       87.41   88    74/15    86.42   87    74/15    86.38   87
892       7/7     82.42   83     8/4/3       80.68   81    66/17    80.74   81    66/17    80.68   81
1020      8/4     77.75   78     9/4/0       77.75   78    64/16    77.75   78    64/16    77.75   78
1275      8/5     74.75   75     9/5/0       75.75   76    64/13    74.75   75    64/13    74.75   75
1530      8/6     72.75   73     8/7/5       73.68   74    64/11    72.75   73    64/11    72.75   73
2040      8/8     69.75   70     9/8/0       69.75   70    64/8     69.75   70    64/8     69.75   70
2555      9/5     66.89   67     9/6/4       67.84   68    59/10    66.88   67    59/10    66.85   67
3066      9/6     64.89   65     9/8/4       65.83   66    57/10    64.89   65    57/10    64.89   65
4089      9/8     62.89   63     10/7/1      61.90   62    56/8     61.95   62    56/8     61.90   62
5626      9/10    60.89   61     10/6/5      58.94   59    52/9     58.95   59    52/9     58.94   59
7672      10/7    57.95   58     10/8/7      56.95   57    53/6     56.95   57    53/6     56.93   57
10234     10/9    55.95   56     11/6/4      53.97   54    48/8     53.98   54    48/8     53.97   54
13305     11/6    52.98   53     11/7/6      51.97   52    47/7     51.98   52    47/7     51.97   52
7 Conclusions

We generalize the exponentiation algorithm of Lim and Lee by making several improvements
in the computations and in the precomputation, achieving a decrease of the
computational cost for a single exponentiation (precomputation and computation).
For the computations, the partitioning of the exponent is modified in a way
which reduces the number of multiplications. Furthermore, we propose a new
efficient method of precomputation for the Lim/Lee algorithm, minimizing the
total time needed for a single exponentiation. We compare the exponentiation
algorithms, showing that if several exponentiations based on the same
precomputation are performed, the Lim/Lee algorithm has the best performance. The
fact that time-consuming precomputations have to be done limits the applicability
of the algorithm to those cryptosystems where the same base is used often,
which holds for most discrete logarithm based systems. Although only slight
improvements are made, these are quite remarkable because of the importance of
exponentiation. The exponentiation can be additionally sped up by applying
parallel processing and is applicable to various computing environments due to
its wide range of time-storage trade-offs.
Abstract. This paper investigates software optimization of special
multiplication. In particular we concentrate on $ax + b \bmod (2^{64} + 13) \bmod 2^{64}$,
which is the bottleneck operation in the DFC cipher. We show that we
can take advantage of the language and architecture properties in order
to get efficient implementations.

In this paper we use the ANSI C and the Java languages. We also
investigate assembly code and data structure alternatives. Finally, we show
that we can also use floating point arithmetic.



1 Introduction 

Several cryptographic algorithms require that some particular multiplication is
optimized. In particular, the DFC AES candidate [1] was believed to be
substantially slower than the others because its main operation
$ax + b \bmod (2^{64} + 13) \bmod 2^{64}$ was believed to be necessarily slow. In this paper we concentrate
on optimization techniques for this operation. The results are of course not
restricted to DFC, since other cryptographic algorithms use this kind of
primitive. For instance, the MMH MAC algorithm [3] uses the
$\sum_{i=1}^{n} a_i x_i \bmod (2^{32} + 15) \bmod 2^{32}$ function, and the Shazam [5] algorithm uses the
$(x + k)^2 \bmod p \bmod n$ operation.

We will first introduce a division-less modular reduction. Then, we
will see what the choices are for implementing the multiplication and this modular
reduction. We will point out some security issues concerning the implementation itself,
since there are many ways to implement it, and see that most concerns can be
resolved using proper operations. Finally, we will show what the best choices are
to get optimal performance with generic ANSI C, 64-bit C, assembly language
and Java on Alpha, Pentium II and UltraSparc processors and on the IA64 architecture.

2 Calculation of $ax + b \bmod (2^{64} + 13) \bmod 2^{64}$

The multi-precision multiplication is best implemented in a straightforward man- 
ner since optimizations such as Karatsuba do not seem worthwhile for such small 
operands. 

As a division is rather slow, we must have another method to do modular 
reduction. We will use the following method: 
Let $P = ax + b$. Note that since $a$, $x$ and $b$ are 64-bit numbers, $P$ is a 128-bit
number ($ax + b \le 2^{128} - 2^{64}$). We first write

$$P = Q \cdot 2^{64} + R$$

where $R$ is the remainder of the Euclidean division of $P$ by $2^{64}$. It follows
that

$$P = Q\,(2^{64} + 13) + R - 13Q$$

and this is equal to $R - 13Q$ modulo $2^{64} + 13$.

The subtraction is a problem since it can lead to two different cases:
$R - 13Q \ge 0$ and $R - 13Q < 0$. Dealing with negative values is tedious since we have
to take care of timing attacks and sometimes use back and forth conversions
between signed and unsigned integers (we want to use all register bits for
multiplication). This can be avoided by splitting the value into smaller words but,
usually, this is not effective.

As we are doing arithmetic modulo $2^{64} + 13$, we can use the bitwise
complement to do the subtraction:

$$P' = R - 13Q = R + 13\,(2^{64} - 1 - Q) - 13\,(2^{64} - 1) \bmod (2^{64} + 13)$$
$$\phantom{P'} = R + 13\,(2^{64} - 1 - Q) + 182 \bmod (2^{64} + 13).$$

$2^{64} - 1 - Q$ is the 64-bit bitwise complement of $Q$. The result is always positive
and is most of the time greater than $2^{64} + 13$. Thus, we can perform a similar
reduction for $P'$: $P' = Q' \cdot 2^{64} + R'$, and $Q' < 14$. Then, we can use a small table
to compute the final values.
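
The reduction just described can be checked with a short Python sketch. This is an illustration of the arithmetic only, not the authors' implementation: the function name `dfc_affine` is hypothetical, Python's unbounded integers stand in for the 128-bit register pair, and the final step uses a plain conditional correction instead of the small lookup table mentioned above (so it is not a constant-time routine).

```python
MASK64 = (1 << 64) - 1
PRIME = (1 << 64) + 13


def dfc_affine(a, x, b):
    """Compute ((a * x + b) mod (2^64 + 13)) mod 2^64 without a division."""
    p = a * x + b                       # 128-bit value P
    q, r = p >> 64, p & MASK64          # P = Q * 2^64 + R
    p1 = r + 13 * (q ^ MASK64) + 182    # P' = R + 13 * (2^64 - 1 - Q) + 182
    q1, r1 = p1 >> 64, p1 & MASK64      # P' = Q' * 2^64 + R', with Q' < 14
    t = r1 - 13 * q1                    # congruent to P modulo 2^64 + 13
    if t < 0:                           # at most one correction is needed
        t += PRIME
    return t & MASK64                   # final reduction modulo 2^64
```

The result can be compared with the direct expression `(a * x + b) % (2**64 + 13) % 2**64` for any 64-bit inputs.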

3 Some Possibilities 

3.1 Multiplication 

Implementations should use arithmetic on numbers of the largest size that is 
efficiently available on the target processor. 

The key factor for speed is the multiplication of two 64-bit quantities yielding
a 128-bit result. The number of multiplications we have to do depends on
the multiplier of the processor, as shown in Table 1.



Table 1. Number of multiplications required

operands (bits)   result (bits)   # multiplications
64                128             1
32 or 64          64              4
16 or 32          32              16
8 or 16           16              64
We also have to add the resulting values, and the number of additions depends
on the size of the registers used (cf. Table 2).

Table 2. Number of additions required

registers (bits)   # additions
64                 4
32                 18
16                 88



The total cost of additions is typically less than the cost of the multiplications 
but is not negligible. Thus, optimizations of this part of the algorithm are of 
prime importance, especially when dealing with carries. 

It is sometimes worth doing more operations on smaller operands, since they 
may be faster. Tables 1 and 2 should give a fair estimation of the cycle counts. 
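
To illustrate the second row of Table 1, the following sketch builds the 128-bit product of two 64-bit values from four 32 x 32 -> 64-bit multiplications. It is a schoolbook decomposition written for clarity; the name `mul64x64` is hypothetical, and the way the partial products are summed here is one possible arrangement, not necessarily the one whose additions are counted in Table 2.

```python
MASK32 = (1 << 32) - 1


def mul64x64(a, x):
    """Return (high, low) 64-bit halves of a * x using four 32-bit multiplications."""
    a_lo, a_hi = a & MASK32, a >> 32
    x_lo, x_hi = x & MASK32, x >> 32
    ll = a_lo * x_lo                    # contributes at weight 2^0
    lh = a_lo * x_hi                    # contributes at weight 2^32
    hl = a_hi * x_lo                    # contributes at weight 2^32
    hh = a_hi * x_hi                    # contributes at weight 2^64
    # Sum the three 32-bit pieces that land on bits 32..63; the overflow is the carry.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)
    low = (ll & MASK32) | ((mid & MASK32) << 32)
    high = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return high, low
```

Keeping the middle sum in a wider variable is what removes the need to propagate a carry after every single addition.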

3.2 Modular Reduction 

Before the modular reduction, we need to add the constant b. This can be done
either before or after the multiplication. The choice is generally dictated by the
language used: a low-level language with “add with carry” usually implies a
straightforward implementation of the multiplication. Otherwise, adding the constant
at the same time as we multiply can save some cycles.

Then, we can implement the modular reduction in many different ways. First, 
we have to split the 128-bit result of the multiplication into many words. On the 
one hand, if we use words of the maximum size available, then we will not have 
any room left to store carries so that we will need to propagate them. On the 
other hand, using words of smaller size will imply more operations. Depending on 
the processor and its parallelization level, one solution is faster than the other. 

In the reduction itself, the multiplication by the constant 13, as explained in
the previous section, is usually optimized by the compiler. This is not the case on
UltraSparc (Sun WC 5.0 compiler) or in Java; there we have to do it manually with
shifts and adds. The second step of the reduction implies another multiplication
by a small operand. On most processors, this operation is faster using a small
lookup table. Not only is it faster, but it also avoids some problems of timing attacks
with such small operands.
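
Both tricks mentioned in this paragraph are easy to write down. The sketch below is only illustrative: `times13`, `TIMES13_TABLE` and `second_step` are hypothetical names, and Python integers do not overflow, so the word-size handling a C or Java version would need is not shown.

```python
def times13(x):
    """13 * x using two shifts and two additions (13 = 8 + 4 + 1)."""
    return (x << 3) + (x << 2) + x


# Q' < 14, so the second multiplication by 13 needs only 14 table entries.
TIMES13_TABLE = tuple(times13(q) for q in range(14))


def second_step(r1, q1):
    """Return R' - 13 * Q' with the product read from the table."""
    return r1 - TIMES13_TABLE[q1]
```

Because the table index is the only data-dependent quantity, the lookup takes the place of a multiplication whose running time could depend on the size of the operand.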

4 Security Issues 

Since the implementation of the algorithm can be done in many distinct ways,
one has to take care of the security issues of the implementation itself.

First of all, the multipliers on some chips can compute the product of small
operands in fewer cycles than for large operands. This feature may make timing
attacks possible, as is the case with most algorithms using integer multiplications
(e.g. RC6, Mars).

Recently, Harvey [2] has noticed that an attack on DFC can be mounted against
careless implementations. He assumes a different approach than the one we present
here. Our implementations can easily be made resistant to timing attacks
(except on the multiplier itself), and rare code paths are easy to test.

Care must be taken with the propagation of carries. On some chips, the 
fastest implementation uses branches and thus is vulnerable to timing attacks. 
On low end processors, the cost of such a protection is noticeable since there are 
many carries to propagate. But on Pentium Pro for instance, the cost is only 
about 40 cycles. 
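
One common way to propagate a carry without a branch is to derive it from the top bits of the operands and of the truncated sum. The sketch below shows that logic for 64-bit words; it is an assumption about how such a protection could look, not the code measured in the paper, and the name `add64_with_carry` is hypothetical. Python itself gives no timing guarantees, so the example only demonstrates the bit manipulation.

```python
MASK64 = (1 << 64) - 1


def add64_with_carry(a, b):
    """Add two 64-bit words; return (sum mod 2^64, carry) without branching."""
    s = (a + b) & MASK64
    # Carry out of bit 63: both top bits set, or one set and the sum's top bit clear.
    carry = ((a & b) | ((a | b) & (s ^ MASK64))) >> 63
    return s, carry
```

The returned carry can be added directly into the next word, so the same instruction sequence runs whether or not an overflow occurred.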

5 Dedicated Optimizations 
5.1 ANSI C 

Writing portable ANSI C code entails that you do not know anything about 
the representation of the objects. When you need a 32-bit unsigned integer, you 
have to use an unsigned long. 

So, using 32-bit integers for both multiplication and additions, an implemen- 
tation of the round function requires at least 16 multiplications and 18 additions. 

On most processors, this implementation will not have a high speed compared 
to an assembly coded function: the processor may be able to deal with larger 
registers or the processor may have some particular properties (e.g. a larger 
multiplier, or an add-with-carry opcode). Those characteristics can not be used 
in a portable ANSI C code. 

Still, an ANSI code should compile and produce the same results whichever 
system and (ANSI) compiler you use. The tradeoff between portability and effi- 
ciency is generally costly. 

As regards the modular reduction, since we do not know anything about
the processor, it does not make much sense to choose between alternatives. The
addition of b can be made within the multiplication. We should use 32-bit words
to do the modular reduction, since it would otherwise require too many
operations.



5.2 C with 64-Bit Integers 

A new standard, called C9X, will allow the use of 64-bit integers (as in Java, for
instance). Indeed, it will not only help 64-bit processors to deliver their full power,
but will also be helpful for 32-bit processors (such as the Pentium II) which have
a larger multiplier.

Using the 64-bit unsigned long long, one needs only 4 multiplications to
compute the 128-bit product of the round function. One should switch
back to 32-bit integers afterwards, since other operations on 64-bit integers are
emulated by the compiler and thus yield poor performance.




Software Optimization of Decorrelation Module 






For native 64-bit processors, not only does the use of 64-bit types reduce the 
number of operations, but these operations are faster: 32-bit operations are often 
emulated by the processor (using integer masks for example) and are slower than 
64-bit opcodes. 

In this code, we can use 32-bit words, even on 64-bit processors, to do the
modular reduction. With 32-bit processors, this is the obvious choice. With
64-bit processors, many instructions are executed each cycle, so grouping
into 32-bit words comes for free if it helps parallelization. The main
advantage is that carry propagation is simplified: carries accumulate in the
32 most significant bits of the registers during the calculation and are
propagated only once, at the end of the computation.
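The deferred-carry idea can be illustrated with a small sketch. It assumes numbers held as four 32-bit limbs, each stored in its own 64-bit word; the names acc_add, acc_normalize and LIMBS are hypothetical and not taken from the paper.

```c
#include <stdint.h>

#define LIMBS 4   /* illustrative size: 4 x 32 bits */

/* Add x into acc limb by limb. No carry handling: any overflow of a
   32-bit limb simply accumulates in the upper 32 bits of the 64-bit word. */
void acc_add(uint64_t acc[LIMBS], const uint64_t x[LIMBS])
{
    int k;
    for (k = 0; k < LIMBS; k++)
        acc[k] += x[k];
}

/* Propagate all pending carries once, at the end.
   Returns the carry out of the most significant limb. */
uint64_t acc_normalize(uint64_t acc[LIMBS])
{
    uint64_t carry = 0;
    int k;
    for (k = 0; k < LIMBS; k++) {
        uint64_t t = acc[k] + carry;
        acc[k] = t & 0xFFFFFFFFu;
        carry = t >> 32;
    }
    return carry;
}
```

As long as fewer than 2^32 terms are accumulated, the upper halves cannot overflow, so the limb additions are independent of one another and can be scheduled in parallel.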

5.3 Alpha 

The Alpha 21164 has a multiplier which takes two 64-bit inputs and provides 
the least significant or most significant half of the 128-bit result after 12 or 14 
cycles respectively. Also, the multiplications can overlap and can run in parallel 
with other operations. 

If we use portable ANSI C code, which only guarantees arithmetic up to 
32 bits, then performance is relatively poor on 64-bit chips. We have to break 
numbers into 32-bit pieces and cannot use any 64-bit capabilities, in particular 
fast multipliers such as that on the 21164. 

ANSI C permits implementations to provide 64-bit arithmetic, however, and
by taking advantage of this we gain a lot of speed. We still cannot get the
most significant half of a 64 x 64-bit product in C. On Alpha we use the
assembly language instruction umulh from C (by implementation-dependent
methods) to attain the best performance on this architecture.

Since the multiplication is atomic, the addition of b + 182 is made after it.
We use 64-bit words; the first multiplication by 13 is done via shifts and the
second one is left to the compiler (which implements it correctly). This can be
done efficiently in C.

More extensive use of assembly language does not appear to yield any signif- 
icant improvement. 

5.4 Pentium II 

The Pentium II is a 32-bit processor with a multiplier which takes two 32-bit 
inputs and returns the 64-bit result. The multiplication instruction is fast: it only 
takes 4 cycles. So, the entire multiplication and the modular reduction should 
be fast as well. 

However, the expected speed is not achievable in C. Using only ANSI C, we
cannot use the multiplication with a 64-bit result. Even if we do use 64-bit
integers via the long long type (which will be part of C9X), the compiled
code does not use the add-with-carry instruction and performs a lot of
unnecessary data movement between registers and memory.

To implement the integer multiplication in assembly, we take advantage of
two operations: addition with carry and 32-by-32 multiplication.

The modular reduction is done with 32-bit words and uses a table lookup.
Once again, we need the addition with carry, so it is also done in assembly
language.

When the whole computation is done in registers using assembly language, 
we get the expected speed of the function. 

5.5 UltraSparc 

The UltraSparc is a 64-bit processor. Until recently, it could not be used as such 
in C since there was no compiler for 64-bit mode. Sun’s new C compiler 5.0 
handles 64-bit integers and performs well on 64-bit C code. 

The multiplier takes two 64-bit inputs and computes the least significant half 
of their product. Unlike the situation on Alpha, there is no method to get the 
most significant half. Thus, we have to do four multiplications to get the full 
result. These multiplications are rather slow: each of them takes about 20 cycles 
so that the time for the entire multiplication is large. The time for modular 
reduction is insignificant in comparison. 

In order to achieve better results, we can use the Floating Point Unit. The 
FPU has a double precision multiplier which we will use in a slightly unusual 
way. Since the FPU uses the IEEE 754 ^ representation of numbers, we can 
use 52 significant bits with double precision floating-point numbers. This means 
that we can multiply 24-bit values and add several of the results without any 
round-off error occurring. Thus, we can do a 64 x 64-bit product using nine 
24 X 24-bit multiplications (and some additions). Alternatively we can do eight 
16 X 32-bit multiplications. These methods are faster than using four integer 
multiplications. 
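The nine-multiplication variant can be sketched in C as follows. This is a hedged illustration rather than the authors' implementation: the name mul64x64_fpu is hypothetical, the code assumes IEEE 754 double precision, and it uses ordinary casts for the conversions purely for readability.

```c
#include <stdint.h>

/* Sketch only: 64 x 64 -> 128-bit product using nine double-precision
   multiplications on 24-bit limbs (the top limb holds only 16 bits). */
void mul64x64_fpu(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    double x[3], y[3];
    double col[5] = {0.0, 0.0, 0.0, 0.0, 0.0};
    uint64_t limb[6], carry = 0;
    int i, j;

    for (i = 0; i < 3; i++) {
        x[i] = (double)((a >> (24 * i)) & 0xFFFFFF);
        y[i] = (double)((b >> (24 * i)) & 0xFFFFFF);
    }

    /* Each product is below 2^48 and each column sums at most three of
       them, so every value stays below 2^50 and is represented exactly. */
    for (i = 0; i < 3; i++)
        for (j = 0; j < 3; j++)
            col[i + j] += x[i] * y[j];

    /* Back to integers: propagate carries in base 2^24. */
    for (i = 0; i < 5; i++) {
        uint64_t t = (uint64_t)col[i] + carry;
        limb[i] = t & 0xFFFFFF;
        carry = t >> 24;
    }
    limb[5] = carry;

    /* limb[i] has weight 2^(24 i); limb[2] straddles the 64-bit boundary. */
    *lo = limb[0] | (limb[1] << 24) | (limb[2] << 48);
    *hi = (limb[2] >> 16) | (limb[3] << 8) | (limb[4] << 32) | (limb[5] << 56);
}
```

The sketch shows only why no round-off occurs; whether it is actually faster depends on how the conversions are carried out.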

The only problem is to convert from integers to floats. When done with casts 
in C, it uses a function of the C library and is very slow. When done in assembler 
using the FiTOd instruction, it is not much better. For reasonable speed we have 
to do it manually via a bit mask and an addition. Similar tricks are used to 
convert from floating-point numbers back to integers in order to do the modular 
reduction. 
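One well-known way to do such a manual conversion is sketched below. It is an assumption that this matches the trick used by the authors; the function names are hypothetical, and the code presumes IEEE 754 doubles whose byte order matches that of 64-bit integers. The integer is placed in the mantissa of the constant 2^52 with a bit mask, and 2^52 is then subtracted (that is, -2^52 is added).

```c
#include <stdint.h>
#include <string.h>

/* Sketch only. Converts x (required: x < 2^52) to double without using an
   integer-to-float conversion instruction. */
double u52_to_double(uint64_t x)
{
    /* 0x433 is the biased exponent of 2^52; x fills the 52 mantissa bits,
       so the bit pattern encodes exactly 2^52 + x. */
    uint64_t bits = UINT64_C(0x4330000000000000) | x;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d - 4503599627370496.0;          /* subtract 2^52 */
}

/* Inverse direction. Requires d to be integer-valued with 0 <= d < 2^52. */
uint64_t double_to_u52(double d)
{
    uint64_t bits;
    d += 4503599627370496.0;                /* add 2^52 */
    memcpy(&bits, &d, sizeof bits);
    return bits & UINT64_C(0x000FFFFFFFFFFFFF);
}
```

Both directions cost one logical operation and one floating-point addition, with no library call.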

This enables us to achieve better performance for the multiplication than
the standard 64-bit C code.

The modular reduction is implemented with 32-bit words and carries are
stored in the high-order bits of the registers. Multiplications by 13 are all
made using shifts and additions.
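A multiplication by 13 with shifts and additions can be written as below. This is a minimal sketch with a hypothetical function name, relying only on 13 = 8 + 4 + 1.

```c
#include <stdint.h>

/* Sketch only: 13 * x computed as 8x + 4x + x, without a multiply
   instruction. The result wraps modulo 2^64 like any unsigned operation. */
uint64_t times13(uint64_t x)
{
    return (x << 3) + (x << 2) + x;
}
```

When x is a 32-bit word held in a 64-bit register, 13 * x is below 2^36, so the excess lands in the high-order bits of the register, where the carries are kept.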



5.6 IA64 Architecture 

Intel has recently unveiled specifications of its next architecture, called IA64. 
This enables us to estimate the cycle counts of the Merced processor. 

^ ANSI/IEEE Standard 754-1985: Standard for Binary Floating Point Arithmetic

The Merced is a 64-bit processor. We have no idea at the moment whether
compilers will be able to deal with the full set of instructions, so we will
only consider assembly language. The key point is that IA64 has a full 64-bit
to 128-bit unsigned integer multiply, so it will end up in the fast category.
The instruction xma is an integer a * x + b (which can never overflow 128 bits).
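The overflow claim can be checked in one line, assuming a, x and b are arbitrary unsigned 64-bit values (so each is at most 2^64 - 1):

```latex
\[
a x + b \;\le\; (2^{64}-1)^2 + (2^{64}-1)
        \;=\; (2^{64}-1)\,2^{64}
        \;=\; 2^{128} - 2^{64} \;<\; 2^{128}.
\]
```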

Terje Mathisen has written an IA64 implementation of the round function
and gets a round timing of 30 cycles, with some more or less obvious
possibilities to save a few more cycles. Of these 30 cycles, 10 are taken by a
pair of sequential xma operations, but the second can be handled with integer
code instead, since the multiplier is small and known (13).

The one possible problem is that integer multiplication uses the floating point
registers, so the time to convert back and forth between integers and floating
point mantissas (currently unknown but believed to be 1 cycle) comes in addition
to the two xma.u operations needed for the low and high halves of the result.
For best performance, all the integer and floating point constants need to be
placed in registers before starting the inner loop.

As predication replaces branches in the carry propagation, there should be 
no possibility for a timing attack based on key or data values. 

Thus, the 240-cycle count compares well to the 231 cycles on a 21264 Alpha.
This answers the criticism that the decorrelation module would be slow on
Merced and shows that some non-trivial optimizations of the code can give a
huge improvement in speed. We also note that the numerous registers and the
multiplier should enable a fast RSA implementation.

5.7 Java 

There is a very simple way to implement the multiplication and the modular
reduction in Java, using BigIntegers. Unfortunately, this reduces the speed
dramatically. It has the advantage of taking only two or three lines of code,
and can provide a reference but not an optimized implementation.

Java provides a 64-bit integer data type, which is always signed. However, the
signedness can simply be ignored in the arithmetic operations we need:
additions, subtractions and multiplications are defined in the standard as if
they were modulo 2^64, and bitwise logical operations use the sign bit as in
normal two’s-complement representation. There is an unsigned right shift
operator, as well as a signed one. The only restrictions are that we cannot use
comparisons (which could have been useful to propagate carries) or signed
divisions (which we do not need anyway). These characteristics are described
in the Java Virtual Machine specifications [4].
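When comparisons are unavailable, a carry can still be recovered with bitwise logic and a logical right shift. The sketch below shows the idea in C with a hypothetical function name; it is one standard formulation, not necessarily the one used in the implementation described here. In Java the same expression would use the unsigned shift operator.

```c
#include <stdint.h>

/* Sketch only: carry out of the 64-bit addition a + b, computed without
   any comparison or branch. The top bit carries out if both top bits are
   set, or if exactly one is set and the sum's top bit came out clear. */
uint64_t carry_out(uint64_t a, uint64_t b)
{
    uint64_t s = a + b;                       /* wraps modulo 2^64 */
    return ((a & b) | ((a | b) & ~s)) >> 63;  /* 0 or 1 */
}
```

Because the instruction sequence does not depend on the operand values, this form also avoids data-dependent branches.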

Thus, the implementation is essentially the same as a 64-bit C version. Since
we cannot do casts (even using “assembly” Java bytecode) and since we cannot
use comparisons, the implementation is naturally resistant to endianness issues
and to most timing attacks (except for the multiplications).

Even if one could produce specific Java code for a processor, it would not
gain very much. Optimized versions for Pentium II and UltraSparc use the
same tricks as optimized C code. Most of the processor-specific optimizations
are done in JIT compilers. Even hand-written bytecode, which should
produce faster code, does not give a noticeable speedup. The reason why we
cannot optimize further is that the set of opcodes is very small and the whole
job is given to the JIT compiler, which optimizes the code for the given
processor. Since 64-bit operations are part of the language, they are well
optimized by the compiler, which reorders and expands some instructions.

Some pitfalls should be avoided to help the compiler, such as splitting the
code into functions as we would in C, or using a multiplication by 13 on the
UltraSparc processor. But Java compilers are relatively new and improve quickly.

With the development of Web services and online payment, these Java
implementations become all the more important, and huge improvements in speed
have been made during the past year: JIT compilers and now HotSpot technology
should give a speed equivalent to C++ according to Sun Microsystems (that is,
roughly the speed of C in cryptography).

6 Conclusion 

Turning to the results, two facts stand out: one should use the largest
integers available, and the round function is not always well optimized by
compilers.

On Alpha, on account of the lack of a C instruction to get the most significant 
half of the multiplication, 200 cycles are lost. The same problem may exist on 
IA64. On Pentium II, compilers do not achieve the expected speed because there 
are too few registers. Results on the UltraSparc are disappointing as a result of 
the slow multiplier. 

In Table 3 below, we compare portable ANSI C code and 64-bit C
code to the best implementation available in order to show the importance of
the optimizations.



Table 3. Number of cycles for DFC

Processor     JAVA   ANSI C   64-bit C   Best
Alpha 21164   n/a    2562     526        310 (ASM)
Pentium II    1481   2592     1262       392 (ASM)
UltraSparc    4087   4160     875        775 (C with floats)
Alpha 21264   n/a    n/a      335        231 (ASM)
IA64          n/a    n/a      n/a        240 (estimated)



With generic ANSI C code, DFC is one of the slowest AES candidates on all 
platforms. Using assembly language, it becomes the fastest on Alpha processors 
and among the fastest on Intel Pentium II and Merced processors. 

We have seen which solutions are best on various microprocessors and
languages. Harvey [2] thought that “correct implementations may be difficult to
achieve”. We have shown that correct and fast implementations can be easily
made.

As the cost of this decorrelation module is nearly the same as the cost of the
multiplication, it could be used as a plug-in in many other algorithms without
a significant decrease in performance. The drawback is that its optimization
requires access to some low-level instructions (add with carry, most significant
bits of the multiplication) which are generally not available in a high-level
language such as C.

Abstract. Pseudonym systems allow users to interact with multiple
organizations anonymously, using pseudonyms. The pseudonyms cannot be
linked, but are formed in such a way that a user can prove to one organi- 
zation a statement about his relationship with another. Such a statement 
is called a credential. Previous work in this area did not protect the sys- 
tem against dishonest users who collectively use their pseudonyms and 
credentials, i.e., share an identity. Previous practical schemes also relied 
very heavily on the involvement of a trusted center. In the present pa- 
per we give a formal definition of pseudonym systems where users are 
motivated not to share their identity, and in which the trusted center’s 
involvement is minimal. We give theoretical constructions for such sys- 
tems based on any one-way function. We also suggest an efficient and 
easy-to-implement practical scheme. 

Keywords: Anonymity, pseudonyms, nyms, credentials, unlinkability, 
credential transfer. 



1 Introduction 

Pseudonym systems were introduced by Chaum in 1985, as a way of allowing
a user to work effectively, but anonymously, with multiple organizations. He sug- 
gests that each organization may know a user by a different pseudonym, or nym. 
These nyms are unlinkable: two organizations cannot combine their databases 
to build up a dossier on the user. Nonetheless, a user can obtain a credential 
from one organization using one of his nyms, and demonstrate possession of the 
credential to another organization, without revealing his first nym to the second 
organization. For example, Bob may get a credential asserting his good health
from his doctor (who knows him by one nym), and show this to his insurance 
company (who knows him by another nym). 

Anonymity and pseudonymity are fascinating and challenging, both technically
(can we achieve them?) and socially (do we want them?). We focus on
technical feasibility, referring the reader to excellent recent treatments by
Brin and Dyson for the social question.

Chaum and Evertse develop a model for pseudonym systems, and present 
an RSA-based implementation. While pseudonyms are information-theoretically 
unlinkable, the scheme relies on a trusted center who must sign all credentials. 

Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 184-^^^ 2000. 

© Springer-Verlag Berlin Heidelberg 2000

Damgård constructs a scheme based on multi-party computations and
bit commitments that provably protects organizations from credential forgery
by malicious users and the central authority, and protects the secrecy of the
users’ identities information-theoretically. The central authority’s role is limited
to ensuring that each pseudonym belongs to some valid user.

Chen presents a discrete-logarithm-based scheme, where a trusted center 
has to validate all the pseudonyms, but does not participate in the credential 
transfer. Chen’s scheme relies very heavily on the honest behavior of the trusted 
center, because a malicious trusted center can also transfer credentials between 
users. 

These schemes have a common weakness: there is little to motivate or prevent 
a user from sharing his pseudonyms or credentials with other users. For example, 
a user may buy an on-line subscription, obtaining a credential asserting his 
subscription’s validity, and then share that credential with all of his friends. 
More serious examples (e.g. driver’s licenses) are easy to imagine. 

We base our proposed scheme on the presumption that each user has a master 
public key whose corresponding secret key the user is highly motivated to keep 
secret. This master key might be registered as his legal digital signature key, so 
that disclosure of his master secret key would allow others to forge signatures on 
important legal or financial documents in his name. Our proposed scheme then 
has the property that a user can not share a credential with a friend without 
sharing his master secret key with the friend, that is, without identity sharing. 
Tamper-resistant devices such as smartcards are not considered in this work. 

Basing security on the user’s motivation to preserve a high-value secret key
has been proposed before, such as in Dwork et al.’s work on protecting digital
content and Goldreich et al.’s study of controlled self-delegation. In recent
work, Canetti et al. incorporated this notion into anonymous credential-granting
schemes to prevent credential sharing among users. However, the model
considered in their work differs considerably from our own: while we explore a
whole system of organizations interacting with pseudonymous users, Canetti et
al. assume that organizations only grant credentials to users who reveal their
identity to them, though the credentials can then be used anonymously. The
practical constructions they give, while based on weaker assumptions than ours,
are not applicable to our situation since they take crucial advantage of the
fact that the credential-granting organization knows the identity of the user
to whom it grants a credential.

In our model, a certification authority is needed only to enable a user to 
prove to an organization that his pseudonym actually corresponds to a master 
public key of a real user with some stake in the secrecy of the corresponding 
master secret key, such that the user can only share a credential issued to that 
pseudonym by sharing his master secret key. As long as the CA does not refuse 
service, a cheating CA can do no harm other than introduce invalid users into 
the system, i.e. users who have nothing to lose in the outside world. 

In our model, each user must first register with the CA, revealing his true
identity and his master public key, and demonstrating possession of the
corresponding master secret key. (In some settings there is no need to motivate
users not to share their identity; the CA is then not needed at all.) After
registration, the user may open accounts with many different organizations
using different, unlinkable pseudonyms. However, all pseudonyms are related to
each other: there exists an identity extractor that can compute a user’s public
and secret master keys given a rewindable user who can authenticate himself as
the holder of the pseudonym.

An organization may issue a credential to a user known by a pseudonym. A 
credential may be single-use (such as a prescription) or multiple-use (such as a 
driver’s license), and may also have an expiration date. Single-use credentials are 
similar to electronic coins, since they can only be used once in an anonymous 
transaction. Some electronic coin protocols protect against double-spending by 
violating the anonymity of double-spenders, but generally do not protect against 
transfer of the coin. A credential should be usable only by the user to whom it 
was issued. 

In Section 2 we formally define our model of a pseudonym system. In Section 3
we extend Damgård’s result and prove that a pseudonym system can be constructed
from any one-way function. We then give a practical construction of a pseudonym
system based on standard number-theoretic assumptions and the hardness of a new
Diffie-Hellman-like problem, which we prove hard with respect to generic group
algorithms. Our construction is easily implementable. Moreover, the secret key
that motivates the user not to share his identity is usable in many existing
practical encryption and signature schemes. As a result, our system integrates
well with existing technology. Finally, we close by discussing some open
problems.



2 The Pseudonym Model 

2.1 Overview 

Informal Definitions. In a pseudonym system, users and organizations inter- 
act using procedures. We begin the discussion of the model by introducing the 
procedures. 

- Master key generation. This procedure generates master key pairs for users
  and organizations. A crucial assumption we make is that users are motivated
  to keep their master secret key secret. This assumption is justified, because
  master public/secret key pairs can correspond to those that the users form
  for signing legal documents or receiving encrypted data. A user, then, is an
  entity (a person, a group of people, a business, etc.) that holds a master
  secret key that corresponds to a master public key.

- Registration with the certification authority. The certification authority
  (CA) is a special organization that knows each user’s identity, i.e. the
  master public key of the user. Its role is to guarantee that users have master
  public/secret key pairs that will be compromised if they cheat. The user’s
  nym with the CA is his master public key. The CA issues a credential to him
  that states that he is a valid user.

- Registration with an organization. A user contacts the organization and
  together they compute a nym for the user. There exists an identity extractor
  which, given a rewindable user that can authenticate himself as the nym
  holder, extracts this user’s master public/secret key pair. Then the user
  demonstrates to the organization that he possesses a credential from the CA.

- Issue of credentials. The user and the organization engage in an interactive
  protocol by which the user obtains a credential.

- Transfer of credentials. A user who has a credential can prove this fact to
  any organization, without revealing any other information about himself. We
  call this operation “transfer” of a credential, because a credential is
  transferred from the user’s pseudonym with one organization to his pseudonym
  with the other.

We want to protect the system from two main types of attacks: 

- Credential forgery: Malicious users, possibly in coalition with other
  organizations including the CA, try to forge a credential for some user.

- User identity compromise or pseudonym linking: Malicious organizations form a
  coalition to try to obtain information about a user’s identity, either by
  getting information about the user’s master public/secret key pair, or by
  identifying a pair of pseudonyms that belong to the same user.

The main difference between our model of a pseudonym system and the previous
models is that in our model the notion of a user is well-defined. In the treatment
of Damgård, a user is an entity who happens to be able to demonstrate the
validity of a credential with the certification authority. Whether this credential
was originally issued to the same entity, or to a different one who subsequently
shared it, remains unclear, and therefore such systems are liable to a credential
forgery attack, namely credential forgery by sharing.



2.2 The General Definitions 
Preliminaries. 

Let $k$ be the security parameter, and let $1^k$ denote the unary string of
length $k$. We use terms such as Turing machine, interactive Turing machine,
probabilistic Turing machine, polynomial-time Turing machine, secure interactive
procedure, and rewindable access in the standard way defined in the literature
and in the full version of the present paper.



Procedures. 

Master Key Generation: 

Definition 1. Asymmetric key generation $G$ is a probabilistic polynomial-time
procedure which, on input $1^k$, generates a master public/secret key pair
$(P, S)$ (the notation $(P, S) \in G(1^k)$ means that $(P, S)$ was generated by
running $G$) such that






1. The public key $P$ that is produced contains a description (possibly
implicit) of a Turing machine $V$ which accepts input $S$.

2. For any family of polynomial-time Turing machines $\{M_k\}$, for all
sufficiently large $k$, for $(P, S) \in G(1^k)$,

$\Pr[M_k(P) = s \text{ such that } V(s) = \mathrm{ACCEPT}] = \mathrm{neg}(k)$

Each user $U$ generates a master key pair $(P_U, S_U) \in G_U(1^k)$ and each
organization $O$ generates a master public/secret key pair
$(P_O, S_O) \in G_U(1^k)$ using asymmetric key generation procedure $G_U$.

Organization’s Key Generation: For each type $C$ of credential issued by
organization $O$, $O$ generates a key pair $(P_O^C, S_O^C) \in G_O(1^k)$ using
asymmetric key generation procedure $G_O$. In this paper, we assume that each
organization only issues one type of credential; our results generalize
straightforwardly to handle multiple credential types per organization.

Nym Generation: The user $U$ generates a nym $N$ for interacting with
organization $O$ by engaging in a secure interactive procedure NC between
himself and the organization.

Definition 2. Nym generation NC is a secure interactive procedure between
two parties, a user with master key pair $(P_U, S_U)$, and an organization with
master key pair $(P_O, S_O)$. The common input to NC is $(P_O)$, $U$ has private
input $(P_U, S_U)$, and $O$ has private input $(S_O)$. We assume that nym
generation is done through a secure anonymous communication channel that
conceals all information about the user. The common output of the protocol is a
nym $N$ for user $U$ with the organization. The private output for the user is
some secret information $SI^U_{U,O}$, and for the organization some secret
information $SI^O_{U,O}$.

We let $N(U, O)$ denote the set of nyms that user $U$ has established with
organization $O$. In this paper we assume that there is at most one such nym,
although our results can be easily generalized. Similarly, we let $N(U)$ denote
the set of nyms the user $U$ has established with any organization, and let
$N(O)$ denote the set of nyms that the organization $O$ has established for any
user.

Communication between a User and an Organization: After a nym is
established, the user can use it to communicate with the organization, using
secure nym authentication defined as follows:

Definition 3. Secure nym authentication is a secure interactive procedure
between user $U$ and organization $O$. Their common input to the procedure is
$N \in N(U, O)$. The organization accepts with probability $1 - \mathrm{neg}(k)$
if the user can prove that he knows $(P_U, S_U, SI^U_{U,O})$ such that $S_U$
corresponds to $P_U$ and $N$ was formed by running NC with the user’s private
input $(P_U, S_U)$ and private output $SI^U_{U,O}$. Otherwise, the organization
rejects with probability $1 - \mathrm{neg}(k)$.

Single-Use Credentials: A single-use credential is a credential that a user
may use safely once, but if used more than once may allow organizations to
link different nyms of the user. A user who wishes to use such a credential
more than once should instead request multiple copies of the credential from
the organization.

Multiple-Use Credentials: A multiple-use credential may be safely trans- 
ferred to as many organizations as the user wishes without having to interact 
further with the issuing organization. 

Credential Issue: To issue a credential to nym $N \in N(U, O)$, the organization
first requires that the user prove that he is the owner of $N$ by running nym
authentication, and then the organization $O$ and the user $U$ run interactive
procedure CI.

Definition 4. Credential issue procedure CI is a secure interactive procedure
between the user with master public/secret key pair $(P_U, S_U)$ and secret nym
generation information $SI^U_{U,O}$, and the organization with master
public/secret key pair $(P_O, S_O)$ and secret nym generation information
$SI^O_{U,O}$, with the following properties:

1. The common input to CI is $(N, P_O)$.

2. The user’s private input to CI is $(P_U, S_U, SI^U_{U,O})$.

3. The organization’s private input to CI is $(S_O, SI^O_{U,O})$.

4. The user’s private output is the credential, $C_{U,O}$.

5. The organization’s private output is secret information, $CSI^O_{U,O}$.

Note that the output of CI, namely $C_{U,O}$, is not necessarily known to the
organization.

Credential transfer: To verify that a user with nym $N' \in N(U, O')$ has a
credential from organization $O$, organization $O'$ runs a secure interactive
procedure CT with the user $U$.

Definition 5. Credential transfer procedure CT is a secure interactive
procedure between user $U$ with master public/secret key pair $(P_U, S_U)$, nyms
$N \in N(U, O)$ and $N' \in N(U, O')$, corresponding secret nym generation
information $SI^U_{U,O}$ and $SI^U_{U,O'}$, and credential $C_{U,O}$; and
organization $O'$ that has master public/secret key pair $(P_{O'}, S_{O'})$ and
secret nym generation information $SI^{O'}_{U,O'}$. Their common input to CT is
$(N', P_O)$. $U$’s private input to CT is
$(P_U, S_U, C_{U,O}, N, SI^U_{U,O}, SI^U_{U,O'})$ (where $N$ is $U$’s pseudonym
with $O$). $O'$ has private input $SI^{O'}_{U,O'}$ to CT. If the inputs to CT
are valid, i.e. formed by running the appropriate protocols above, then $O'$
accepts; otherwise $O'$ rejects with probability $1 - \mathrm{neg}(k)$.
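As a reading aid, the inputs and outputs stated in Definitions 2, 4 and 5 can be collected in one place. The array below only restates those definitions; the placement of the indices on the secret nym information $SI$ (superscript for the party holding it, subscript for the user/organization pair) is an assumption about the notation, which is damaged in this copy.

```latex
\[
\begin{array}{l|l|l|l|l}
 & \text{common input} & \text{private input of } U
 & \text{private input of the organization} & \text{output} \\ \hline
NC & (P_O) & (P_U, S_U) & (S_O)
   & N;\ SI^U_{U,O} \text{ to } U,\ SI^O_{U,O} \text{ to } O \\
CI & (N, P_O) & (P_U, S_U, SI^U_{U,O}) & (S_O, SI^O_{U,O})
   & C_{U,O} \text{ to } U,\ CSI^O_{U,O} \text{ to } O \\
CT & (N', P_O) & (P_U, S_U, C_{U,O}, N, SI^U_{U,O}, SI^U_{U,O'})
   & SI^{O'}_{U,O'} & O' \text{ accepts or rejects}
\end{array}
\]
```

Read row by row, each procedure consumes the private outputs of the one before it: CI needs the secret information produced by NC, and CT needs the credential produced by CI.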

Note that if the credential is single-use, CT does not need to be an interactive
procedure. The user needs only reveal $C_{U,O}$ to $O'$, and then $O'$ will
perform the necessary computation.

If the credential is multiple-use, this procedure need not be interactive either.
The user might only need to compute a function of $C_{U,O}$, $P_U$ and $S_U$ and
hand the result over to $O'$ to convince $O'$ that he is a credential holder.

Requirements.

All the procedures described above constitute a secure pseudonym system if 
and only if they satisfy the requirements outlined below. The reader is referred 
to the full version of the present paper for a more rigorous treatment of these 
requirements. 

Each Authenticated Pseudonym Corresponds to a Unique User: Even
though the identity of a user who owns a nym must remain unknown, we require
that there exists a canonical Turing machine called the identity extractor $ID$,
such that for any valid nym $N$, given rewindable access to a Turing machine $M$
that can successfully authenticate itself as the holder of $N$ with
non-negligible probability, $ID(N, M)$ outputs a valid master public/secret key
pair with high probability. Moreover, we require that for each nym, this pair be
unique.

Security of the User’s Master Secret Key: We want to make sure that
user $U$’s master secret key $S_U$ is not revealed by his public key $P_U$ or by
the user’s interaction with the pseudonym system. We require that whatever can
be computed about the user’s secret key as a result of the user’s interaction
with the system can be computed from his public key alone.

Credential Sharing Implies Master Secret Sharing: User Alice who has a
valid credential might want to help her friend Bob to improperly obtain whatever
privileges the credential brings. She could do so by revealing her master secret
key to Bob, so that Bob could successfully impersonate her in all regards. We
cannot prevent this attack, but we do require of a scheme that whenever Alice
discloses some information that allows Bob to use her credentials or nyms, she
is thereby effectively disclosing her master secret key to him. That is to say,
there exists an extractor such that if Bob succeeds in using a credential that
was not issued to his pseudonym, then the secret key of another user who does
possess a valid credential can be extracted by having rewindable access to Bob.

Unlinkability of Pseudonyms: We require that the nyms of a user not be
linkable at any time with probability better than random guessing.

Unforgeability of Credentials: We require that a credential may not be issued 
to a user without the organization’s cooperation. 

Pseudonym as a Public Key for Signatures and Encryption: Addition- 
ally, there is an optional but desirable feature of a nym system: the ability to 
sign with one’s nym, as well as encrypt and decrypt messages. 



2.3 Building a Pseudonym System from these Procedures 

If we are given procedures with the properties as above, we can use them as
building blocks for nym systems with various specifications. To ensure that each
user uses only one master public/secret key pair, and one that is indeed external
to the pseudonym system, we need the certification authority. The certification
authority is just an organization that gives out the credential of validity. The
user establishes a nym $N$ with the CA, reveals his true identity and then
authenticates himself as the valid holder of $N$. He then proves that
$ID(N) = (P_U, S_U)$, where $P_U$ is $U$’s master public key, as the CA may
verify. Then the CA issues a credential of validity for $N$, which the user may
subsequently transfer to other organizations, to prove to them that he is a
valid user.

In some systems there is no need for a certification authority, because there 
is no need for a digital identity to correspond to a physical identity. For example, 
in a banking system it is not a problem if users have more than one account or 
if groups of individuals open accounts with banks and merchants. 

We refer the reader to the full version of the paper for a comprehensive 
treatment of other useful features a pseudonym system might have. 

3 Constructions of Pseudonym Systems Based on Any 
One-Way Function 

This section focuses on demonstrating that the model that we presented in
Section 2 is feasible under the assumption that one-way functions exist. Our
theoretical constructions use zero-knowledge proofs, and therefore they do not suggest a
practical way of implementing a pseudonym system. Rather, their significance is 
mostly in demonstrating the feasibility of pseudonym systems of various flavors. 
It is also in demonstrating that the existence of one-way functions is a necessary 
and sufficient condition for the existence of pseudonym systems as we define 
them. 



3.1 Preliminaries 



The definitions of terms such as one-way functions, zero-knowledge proofs
and knowledge extractors, bit commitment schemes, and signature schemes
can be found in standard treatments.

Theorem 1. The existence of one-way functions is a necessary condition for 
the existence of pseudonym systems. 



This theorem follows from the way we defined asymmetric key generation. 
See the final version of this paper for the proof. 

In the constructions of pseudonym systems presented below, we will need
to use the fact that the existence of one-way functions implies the existence of
secure bit commitment schemes and signature schemes, and also of
zero-knowledge protocols with knowledge extractors.



3.2 Construction of a System with Multiple-Use Credentials 

Our theoretical construction of a system with multiple-use credentials is a
straightforward extension of the construction by Damgård.

Suppose we are given a signature scheme $(G, Sign, Verify)$, where $G$ is the
key generation algorithm; $Sign(PK, SK, m)$ is the procedure that, on input key
pair $(PK, SK) \in G(1^k)$ and message $m$, produces a signature $s$, denoted as
$s \in \sigma_{PK}(m)$; and $Verify(PK, m, s)$ is the verification algorithm.

Also suppose we are given a bit commitment scheme $(Commit, Check)$ where
$Commit(a, r)$ is the commitment algorithm that produces a commitment to $a$
with randomness $r$; if $c = Commit(a, r)$ then $Check(c, a, r)$ verifies that $c$ is a
commitment to $a$.

A user $U$ runs $G(1^k)$ to create his master public key/secret key pair $(P_U, S_U)$;
an organization $O$ creates its master public key/secret key pair $(P_O, S_O)$ similarly.

To register with the CA, the user reveals his public key $P_U$ to the CA. The
CA outputs $C_{U,CA} \in \sigma_{CA}(P_U)$.

To establish a pseudonym with an organization $O$, the user $U$ computes
$N_{U,O} = Commit((P_U, S_U), R_{U,O})$, where $R_{U,O}$ is a random string that the user
has generated for the purposes of computing this pseudonym and which
corresponds to his private output.

To prove that his pseudonym $N_{U,O}$ is valid and that he has registered with
the CA, the user proves knowledge of $P_U$, $S_U$, $R_{U,O}$ and $C_{U,CA}$ such that

1. $S_U$ corresponds to $P_U$,

2. $N_{U,O} = Commit((P_U, S_U), R_{U,O})$,

3. $Verify_{CA}(P_U, C_{U,CA}) = ACCEPT$.

The identity extractor $ID$ is the knowledge extractor for the above
zero-knowledge proof of knowledge that outputs the $P_U$ and $S_U$ components.

To issue a credential to a user known to the organization $O$ as $N$, the
organization $O$ outputs a signature $C_{U,O} \in \sigma_O(N)$.

Let the user’s nym with organization $O'$ be $N'$. To prove to $O'$ that he has
a credential from $O$, the user executes a zero-knowledge proof of knowledge of
$P_U$, $S_U$, $R$, $R'$, $N$ and $C_{U,O} \in \sigma_O(N)$ such that

1. $S_U$ corresponds to $P_U$,

2. $N = Commit((P_U, S_U), R)$,

3. $N' = Commit((P_U, S_U), R')$,

4. $Verify_O(N, C_{U,O}) = ACCEPT$.

Theorem 2. The system described above is a pseudonym system. 

The proof can be found in the full version of the paper. 

3.3 Construction of a System with Single-Use Credentials 

This is essentially the same construction. The master key and pseudonym gen- 
eration procedures are identical. The difference is that each credential has a 
serial number, which is an additional input in the credential issue and transfer 
procedures. 

4 Practical Constructions 

We will begin this section by describing some well-known constructions based on 
the discrete logarithm problem. We then show how, using the constructions, to 
build a scheme that implements our model of a pseudonym system with one-time 
credentials.

4.1 Preliminaries

Setting. We assume that we are working in a group $G_q$ of prime order $q$, in which
the discrete logarithm problem and the Diffie-Hellman problems (computational,
decisional, etc.) are believed to be hard. We also rely on the random oracle model.

4.2 Building Blocks 

Proving Equality of Discrete Logarithms. First, we review protocol $\Pi$, the
protocol of Chaum and Pedersen that is assumed to be a zero-knowledge
proof of equality of discrete logarithms.

Protocol $\Pi$ for Proving Equality of Discrete Logarithms:

Common inputs: $g, h, \tilde{g}, \tilde{h} \in G_q$

Prover knows: $x \in Z_q^*$ such that $h = g^x$ and $\tilde{h} = \tilde{g}^x$

P → V: Choose $r \in_R Z_q^*$; send $(A = g^r, B = \tilde{g}^r)$.

V → P: Choose $c \in_R Z_q^*$; send $c$.

P → V: Send $y = r + cx \bmod q$.

V: Check that $g^y = A h^c$ and $\tilde{g}^y = B \tilde{h}^c$.



Note that to obtain $\Pi_{NI}$, the non-interactive version of $\Pi$,
set $c = \mathcal{H}(A, B)$, where $\mathcal{H}$ is the hash function.
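The three moves of protocol $\Pi$ can be traced in a few lines of code. The following is a minimal sketch only, assuming a toy group (the order-11 subgroup of $Z_{23}^*$ with generators 2 and 3); these parameters and the function names are illustrative assumptions, not part of the paper, and are far too small to be secure.

```python
import secrets

# Toy parameters (assumption, for illustration only): the subgroup of
# prime order q = 11 inside Z_23^*. Both 2 and 3 generate this subgroup.
p, q = 23, 11
g, g_t = 2, 3  # g_t plays the role of g-tilde


def rand_exp():
    """Return a uniform element of Z_q^* = {1, ..., q-1}."""
    return 1 + secrets.randbelow(q - 1)


def commit(r):
    """Prover's first message (A, B) = (g^r, g_t^r)."""
    return pow(g, r, p), pow(g_t, r, p)


def respond(r, c, x):
    """Prover's response y = r + c*x mod q."""
    return (r + c * x) % q


def verify(h, h_t, A, B, c, y):
    """Verifier's check: g^y = A*h^c and g_t^y = B*h_t^c."""
    return (pow(g, y, p) == A * pow(h, c, p) % p
            and pow(g_t, y, p) == B * pow(h_t, c, p) % p)


# One honest run of the protocol.
x = rand_exp()                          # prover's secret
h, h_t = pow(g, x, p), pow(g_t, x, p)   # common inputs h and h-tilde
r = rand_exp()
A, B = commit(r)                        # P -> V
c = rand_exp()                          # V -> P
y = respond(r, c, x)                    # P -> V
accepted = verify(h, h_t, A, B, c, y)   # V checks
```

An honest run is always accepted, while the same transcript fails the second check if $\tilde{h}$ is replaced by $\tilde{g}^{x'}$ for some $x' \neq x$.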

This protocol proves both knowledge of the discrete logarithm $x$, and the
fact that it is the same for $(g, h)$ and $(\tilde{g}, \tilde{h})$. The following summarizes what is
known about such a protocol:

Theorem 3. If, as a result of executing protocol $\Pi$, the verifier accepts, then
with probability $1 - neg(k)$, the prover knows $x$ such that $g^x = h \bmod p$.

Theorem 4. If, as a result of executing protocol $\Pi$, the verifier accepts, then
with probability $1 - neg(k)$, $x_1 = x_2$, where $x_1$ is such that $g^{x_1} = h \bmod p$ and
$x_2$ is such that $\tilde{g}^{x_2} = \tilde{h} \bmod p$.



Conjecture 1. Protocol $\Pi$ is a secure interactive procedure.



We note that the knowledge extractor $E$ for protocol $\Pi$ just needs to ask
the prover two different challenges on the same commitment, and then solve
the corresponding system of linear equations $y_1 = r + c_1 x$ and $y_2 = r + c_2 x$ to
compute the secret $x$.
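Subtracting the two equations gives $y_1 - y_2 = (c_1 - c_2)x$, so $x = (y_1 - y_2)(c_1 - c_2)^{-1} \bmod q$. A minimal sketch of this step, assuming a toy modulus $q = 11$ and hypothetical values for $x$, $r$, $c_1$ and $c_2$ chosen only for illustration:

```python
q = 11  # toy group order (assumption, for illustration only)


def extract(c1, y1, c2, y2):
    """Solve y1 = r + c1*x and y2 = r + c2*x (mod q) for x; requires c1 != c2."""
    inv = pow((c1 - c2) % q, -1, q)  # modular inverse (Python 3.8+)
    return (y1 - y2) * inv % q


# Two accepting responses on the same commitment (same r), different challenges.
x, r = 7, 4
c1, c2 = 3, 9
y1 = (r + c1 * x) % q
y2 = (r + c2 * x) % q
recovered = extract(c1, y1, c2, y2)
```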



Non-interactive Proof of Equality of DL. We note that $\Pi$ can be made
non-interactive (we denote it by $\Pi_{NI}$) by using a sufficiently strong hash function
$\mathcal{H}$ (for example a random oracle) to select the verifier’s challenge based on
the prover’s first message.

Blind Non-interactive Proof of Equality of DL. Clearly, we can obtain a
transcript of this non-interactive protocol by executing the interactive protocol.
In addition, we can execute the interactive protocol in such a way that the
prover’s view of it cannot be linked with the resulting transcript. In protocol $\Gamma$,
if $\gamma$ is selected at random, the transcript produced by $\Gamma$ is equally likely to have
come from any $\tilde{g}$ and any choice of $r$ and $c$.

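Protocol $\Gamma$: Producing a Blinded Transcript of Protocol $\Pi_{NI}$

Common inputs and prover knowledge: same as in protocol $\Pi$

Verifier input: $\gamma \in Z_q^*$.

Verifier wants: use prover of $\Pi$ to produce a valid transcript of protocol $\Pi_{NI}$
on input $g$, $h$, $\tilde{G} = \tilde{g}^{\gamma}$, $\tilde{H} = \tilde{h}^{\gamma}$.

Note: Prover behavior is identical to protocol $\Pi$.

P → V: Choose $r \in_R Z_q^*$; send $(A = g^r, B = \tilde{g}^r)$.

V: Choose $\alpha, \beta \in_R Z_q^*$. Let $A' = A g^{\alpha} h^{\beta}$, $B' = (B \tilde{g}^{\alpha} \tilde{h}^{\beta})^{\gamma}$.

V → P: Send $c = \mathcal{H}(A', B') + \beta \bmod q$.

P → V: Send $y = r + cx \bmod q$.

V: Check that $g^y = A h^c$ and $\tilde{g}^y = B \tilde{h}^c$.

Note: $g^{y+\alpha} = A' h^{\mathcal{H}(A', B')}$ and $\tilde{G}^{y+\alpha} = B' \tilde{H}^{\mathcal{H}(A', B')}$.

Output transcript: $((A', B'), \mathcal{H}(A', B'), y + \alpha)$.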
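The blinding steps can be checked mechanically. The sketch below runs protocol $\Gamma$ with both parties in one function and then verifies the output as a $\Pi_{NI}$ transcript for the blinded inputs $(g, h, \tilde{G}, \tilde{H})$. It is an illustration under stated assumptions: the toy group (order-11 subgroup of $Z_{23}^*$), the SHA-256-based stand-in for $\mathcal{H}$, and all function names are hypothetical choices, not taken from the paper.

```python
import hashlib
import secrets

# Toy parameters (assumption): subgroup of prime order q = 11 in Z_23^*.
p, q = 23, 11
g, g_t = 2, 3  # g and g-tilde


def H(A, B):
    """Stand-in for the hash function H: SHA-256 reduced mod q (assumption)."""
    digest = hashlib.sha256(f"{A},{B}".encode()).digest()
    return int.from_bytes(digest, "big") % q


def rand_exp():
    """Return a uniform element of Z_q^* = {1, ..., q-1}."""
    return 1 + secrets.randbelow(q - 1)


def blinded_transcript(x, gamma):
    """Run protocol Gamma (prover and verifier in one function)."""
    h, h_t = pow(g, x, p), pow(g_t, x, p)
    # P -> V: commitment
    r = rand_exp()
    A, B = pow(g, r, p), pow(g_t, r, p)
    # V: blind the commitment with alpha, beta and gamma
    alpha, beta = rand_exp(), rand_exp()
    A1 = A * pow(g, alpha, p) * pow(h, beta, p) % p
    B1 = pow(B * pow(g_t, alpha, p) * pow(h_t, beta, p) % p, gamma, p)
    c = (H(A1, B1) + beta) % q          # V -> P: shifted challenge
    y = (r + c * x) % q                 # P -> V: response
    # V: the usual checks of protocol Pi
    assert pow(g, y, p) == A * pow(h, c, p) % p
    assert pow(g_t, y, p) == B * pow(h_t, c, p) % p
    return (A1, B1), H(A1, B1), (y + alpha) % q


def verify_ni(h, G_t, H_t, transcript):
    """Check a Pi_NI transcript on inputs (g, h, G_t, H_t)."""
    (A1, B1), c1, y1 = transcript
    return (c1 == H(A1, B1)
            and pow(g, y1, p) == A1 * pow(h, c1, p) % p
            and pow(G_t, y1, p) == B1 * pow(H_t, c1, p) % p)


x, gamma = rand_exp(), rand_exp()
transcript = blinded_transcript(x, gamma)
h = pow(g, x, p)
G_t = pow(g_t, gamma, p)                  # blinded base
H_t = pow(pow(g_t, x, p), gamma, p)       # blinded value
ok = verify_ni(h, G_t, H_t, transcript)
```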



The above protocol is blind, that is, if the verifier runs it with the prover 
several times and then shows one of the outputs to the prover, the prover will not 
be able to guess correctly which conversation the output refers to, any better 
than by random guessing. The following theorem is well-known; we refer the 
reader to the final version of this paper for a proof: 

Theorem 5. The verifier’s output in protocol $\Gamma$ is independent of the prover’s
view of the conversation.

4.3 The Construction 

We are now ready to present our construction based on the building blocks
introduced above. Our construction is similar in flavour to that given by Chen.

High-Level Description. A user’s master public key is $g^x$, and the
corresponding master secret key is $x$. A user’s nym is formed by taking a random
base $a$, such that the user does not know $\log_g a$, and raising it to the power $x$.
As a result, all of the user’s nyms are tied to his secret $x$. When a credential is
issued, we want to make sure that it will not be valid for any secret other than
$x$.

A credential in our construction is a non-interactive proof of knowledge of the 
organization’s secret. If the user uses it twice, it can be linked, since he cannot 
produce another such credential on his own. 






Detailed Description. The pseudonym system protocols are implemented as 
follows: 

User Master Key Generation: The user picks his master secret $x \in Z_q^*$ and
publishes $g^x \bmod p$.

Organization Credential Key Generation: The organization picks two secret
exponents, $s_1 \in Z_q^*$ and $s_2 \in Z_q^*$, and publishes $g^{s_1} \bmod p$ and $g^{s_2} \bmod p$.

Nym Generation: We describe this protocol in the figure below.

Pseudonym Generation:

User $U$’s master public key: $g^x$

User $U$’s master secret key: $x$

U: Choose $\gamma \in_R Z_q^*$. Set $\tilde{a} = g^{\gamma}$ and $\tilde{b} = \tilde{a}^x$.

U → O: Send $(\tilde{a}, \tilde{b})$.

O: Choose $r \in_R Z_q^*$. Set $a = \tilde{a}^r$.

O → U: Send $a$.

U: Compute $b = a^x$.

U ↔ O: Execute protocol $\Pi$ to show $\log_a b = \log_{\tilde{a}} \tilde{b}$.

U, O: Remember $U$’s nym $N = (a, b)$.

Note that in the special case that $O$ is the CA, the user should
send $(g, g^x)$ instead of $(\tilde{a}, \tilde{b})$.

Communication between a User and an Organization: To authenticate
nym $(a, b)$, the user and the organization execute a standard secure protocol that
proves the user’s knowledge of $\log_a b$. (E.g., they can run $\Pi$ to prove that
$\log_a b = \log_a b$.)

Credential Issue and Transfer: These protocols are described in the figures
below.

Issuing a Credential:

User’s nym with organization $O$: $(a, b)$ where $b = a^x$

Organization $O$’s public credential key: $(g, h_1, h_2)$ where $h_1 = g^{s_1}$, $h_2 = g^{s_2}$

Organization $O$’s secret credential key: $(s_1, s_2)$

O → U: Send $(A = b^{s_2}, B = (a b^{s_2})^{s_1})$.

U: Choose $\gamma \in_R Z_q^*$.

O ↔ U: Run $\Gamma$ to show $\log_b A = \log_g h_2$ with Verifier input $\gamma$.
Obtain transcript $T_1$.

O ↔ U: Run $\Gamma$ to show $\log_{(aA)} B = \log_g h_1$ with Verifier input $\gamma$.
Obtain transcript $T_2$.

U: Remember credential $C_{U,O} = (a^{\gamma}, b^{\gamma}, A^{\gamma}, B^{\gamma}, T_1, T_2)$.

Transferring a Credential to Another Organization:

Organization $O$’s public credential key: $(g, h_1, h_2)$ where $h_1 = g^{s_1}$, $h_2 = g^{s_2}$

User’s nym with organization $O'$: $(\tilde{a}, \tilde{b})$ where $\tilde{b} = \tilde{a}^x$

User’s credential from organization $O$: $C_{U,O} = (a', b', A', B', T_1, T_2)$

O': Verify correctness of $T_1$ and $T_2$ as transcripts for $\Pi_{NI}$
for showing $\log_{b'} A' = \log_g h_2$ and $\log_{(a'A')} B' = \log_g h_1$.

U ↔ O': Execute Protocol $\Pi$ to show $\log_{\tilde{a}} \tilde{b} = \log_{a'} b'$.
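The algebra behind issue and transfer can be checked directly: raising every component of the credential to the same power $\gamma$ preserves the two discrete-logarithm relations that $O'$ verifies, and keeps the credential tied to the same secret $x$. The sketch below is an illustration only; the toy group, the concrete exponents, and the brute-force helper `dlog_equal` are assumptions made for the example and play no role in the actual scheme, where these relations are shown by the proofs $\Pi$ and $\Pi_{NI}$.

```python
# Toy parameters (assumption): subgroup of prime order q = 11 in Z_23^*.
p, q, g = 23, 11, 2


def dlog_equal(base1, val1, base2, val2):
    """Brute-force check that log_base1(val1) == log_base2(val2) (toy group only)."""
    return any(pow(base1, e, p) == val1 and pow(base2, e, p) == val2
               for e in range(q))


# Hypothetical secrets, chosen only for illustration.
x = 5                      # user's master secret
s1, s2 = 3, 7              # organization O's secret credential key
h1, h2 = pow(g, s1, p), pow(g, s2, p)

# User's nym (a, b) with O, and the values O sends when issuing.
a = pow(g, 4, p)
b = pow(a, x, p)
A = pow(b, s2, p)
B = pow(a * A % p, s1, p)

# The user stores the gamma-th powers as the credential.
gamma = 6
a1, b1, A1, B1 = (pow(v, gamma, p) for v in (a, b, A, B))

# Relations that O' checks through the transcripts T1 and T2.
rel_T1 = dlog_equal(b1, A1, g, h2)                # log_{b'} A' = log_g h2
rel_T2 = dlog_equal(a1 * A1 % p, B1, g, h1)       # log_{a'A'} B' = log_g h1

# The user's nym with O' uses the same secret x, so it links to (a', b').
a_t = pow(g, 9, p)
b_t = pow(a_t, x, p)
rel_nym = dlog_equal(a_t, b_t, a1, b1)            # log_{a~} b~ = log_{a'} b'
```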



The Nym as Public Key for Signatures and Encryption: There are many
encryption and signature schemes based on the discrete logarithm problem that
can be used, such as the ElGamal or Schnorr schemes.

Security of the Scheme. We prove that the scheme presented above satisfies
the definition of a pseudonym system given in Section 2 in the full version of the
present paper. Below we outline the assumptions under which this follows.

Recall the setting: a group $G_q$ of order $q$ and access to a random oracle. The
following assumptions are necessary:

1. We rely on the Decisional Diffie-Hellman assumption.

2. We assume that Protocol $\Pi$ for proving equality of discrete logarithms is
secure.

3. We assume that the following problem is hard:

Problem 1. Let $G$ be a cyclic group with generator $g$ and of order $|G|$. Let
$g^x$ and $g^y$ be given. Furthermore, assume that an oracle can be called that
answers a query $s$ by a triple $(a, a^{sy}, a^{x+sxy})$, where $a = g^z$ is a random group
element of $G$. Let this oracle be called for $s_1, s_2, \ldots$. Then, the problem is to
generate a quadruple $(t, b, b^{ty}, b^{x+txy})$, where $t \notin \{0, s_1, s_2, \ldots\}$, and where
$b \neq e$.

Theorem 6 shows the hardness of Problem 1 with respect to generic
algorithms (as defined by Shoup), unless the group order is divisible by a
small prime factor.

Theorem 6. Let $p$ be the smallest prime factor of $n$. The running time of
a probabilistic generic algorithm solving Problem 1 for groups of order $n$ is
of order $\Omega(\sqrt{p})$.

Proof Idea. The proof is based on the fact that the event $\mathcal{E}$ that two of the
computed group elements are equal ($\mathcal{E}$ is called the collision event) has the
following two properties. First, the event has probability of order $O(T^2/p)$,
where $T$ is the number of steps performed by the generic algorithm. Second,
given that the event $\mathcal{E}$ does not occur, the algorithm produces a correct
4-tuple only with probability $O(1/p)$. □

Although for any particular group used, there can exist specific (non-generic)
algorithms solving Problem 1, the generic hardness of the problem is strong
evidence for the existence of groups for which the problem is hard.



4.4 Multiple-Use Credentials 

We have not been able to construct a system with multiple-use credentials which 
would completely conform to the specifications of our model. However, with a 
slight variation on the model and a straightforward modification of the scheme 
described above, we can get a scheme with multiple-use credentials. Moreover, 
in this setting we will no longer require the random oracle. 

To implement this, our pseudonym generation and credential issue
procedures will remain the same. As a result, the user will possess $C_{U,O} = (a, b, A, B)$,
where $A = b^{s_2}$, $B = (a b^{s_2})^{s_1}$ and $(a, b) = (a, a^x)$ is the user’s nym with the
issuing organization. The user can therefore sample, for any $\gamma$, the 4-tuples
$f_{\gamma}(C_{U,O}) = (a^{\gamma}, b^{\gamma}, A^{\gamma}, B^{\gamma})$. For any 4-tuple formed that way, for any correctly
formed pseudonym $(a', b')$, the user will be able to prove that $\log_a b = \log_{a'} b'$.
If the issuing organization is required to cooperate with the receiving
organization, it can confirm that $f_{\gamma}(C_{U,O})$ is a valid credential that corresponds to
nym $(a^{\gamma}, b^{\gamma})$, or disprove that statement if it is not true. This is as secure as the
scheme with one-time credentials.



5 Conclusions and Open Questions 

The present work’s contributions are in defining a model for pseudonym sys- 
tems and proving it feasible, as well as proposing a practical scheme which is 
a significant improvement over its predecessors. Open problems lie in the area 
of identifying useful features for a pseudonym system (some features not
mentioned in this extended abstract have been introduced and discussed in the full
version of the present paper); in removing interactiveness in the theoretical
constructions; and in coming up with good practical constructions that conform
to our specifications.

Abstract. Verifiable secret sharing schemes (VSS) are secret sharing
schemes dealing with possible cheating by the participants. In this
paper, we propose a new unconditionally secure VSS. Then we construct a
new proactive secret sharing scheme based on that VSS. In a proactive 
scheme, the shares are periodically renewed so that an adversary cannot 
get any information about the secret unless he is able to access a specified 
number of shares in a short time period. Furthermore, we introduce some 
combinatorial structure into the proactive scheme to make the scheme 
more efficient. The combinatorial method might also be used to improve 
some of the previously constructed proactive schemes. 



1 Introduction 

One important topic in cryptography is how to securely share a secret among 
a group of people. In some cases, many people need to share the power to use 
a cryptosystem. Thus some secret information should be shared by a group so 
that the cryptosystem can be used only if it is permitted by a specified subset 
of the group. The problem of how to keep a secure backup of a secret key and
how to recover it securely was first studied by Blakley and Shamir
independently. Shamir proposed a polynomial threshold scheme. In a
$(t, n)$-threshold scheme, a secret value is shared by $n$ participants such that any $t$ of the
participants can reconstruct the secret value by putting their shares together,
but any $t-1$ participants cannot get any information about the secret value. In
such a scheme, an adversary needs to compromise at least $t$ locations in order
to learn the secret, and corrupt at least $n-t+1$ locations to destroy the secret.
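As a concrete illustration of a polynomial $(t, n)$-threshold scheme, the following minimal sketch shares a secret as the constant term of a random polynomial of degree at most $t-1$ and recovers it by Lagrange interpolation at 0. The field size, the evaluation points $1, \ldots, n$ and the function names are assumptions made for this example.

```python
import secrets

Q = 2**61 - 1  # a prime field size (assumption, chosen for illustration)


def share(secret, t, n):
    """Return n points (i, f(i)) of a random polynomial f of degree <= t-1 with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(t - 1)]
    return [(i, sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q)
            for i in range(1, n + 1)]


def reconstruct(points):
    """Lagrange interpolation at 0 from t points with distinct x-coordinates."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xk, _ in points:
            if xk != xi:
                num = num * xk % Q
                den = den * (xk - xi) % Q
        total = (total + yi * num * pow(den, -1, Q)) % Q
    return total


shares = share(1234, t=3, n=5)
recovered = reconstruct(shares[:3])   # any 3 of the 5 shares suffice
```

Here corrupting $n - t + 1 = 3$ of the 5 shares leaves only 2 intact, which is fewer than the $t = 3$ needed for reconstruction.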

In many situations, such as cryptographic master keys, data files, legal docu- 
ments, etc., a secret value needs to be stored for a long time. In these situations, 
an adversary may attack the locations one by one and eventually get the secret 
or destroy it. To prevent such an attack, proactive secret sharing schemes are 
proposed. Proactive security for secret sharing was first suggested by Ostrovsky
and Yung, who presented, among other things, a proactive polynomial
secret sharing scheme. Proactive security refers to security and availability
in the presence of a mobile adversary. Herzberg et al. specialized this
notion to robust secret sharing schemes and gave a detailed efficient proactive
secret sharing scheme. In their scheme, a secret value is shared by $n$ servers. The
mobile adversary is able to attack all the servers during a long period of time.
However, since the corrupted servers can be rebooted, in any time period
only a subset of the servers is corrupted. “Robust” means that in any time
period, the servers can reconstruct the secret value correctly.

The scheme of Herzberg et al. is based on Shamir’s polynomial threshold scheme, thus
most aspects of the scheme are unconditionally secure. However, their scheme
depends on the verifiable secret sharing schemes of Feldman and Pedersen, which depend on
some cryptographic assumptions. The security of Feldman’s scheme is based on
the hardness of solving discrete logarithms. In Pedersen’s scheme, the privacy of
the secret is unconditionally secure, but the correctness of the shares depends
on a computational assumption. In a sense, these two schemes complement each
other.

The purpose of this paper is to provide a new proactive secret sharing scheme 
which is unconditionally secure, i.e., the security of any part of the scheme is 
not based on any cryptographic assumption. Let $S$ be the set of possible secret
values, where $|S| = q$. Then unconditional security of the scheme means that at
any time the adversary cannot guess the shared secret $s \in S$ with probability
better than $1/q$.

We first propose an unconditionally secure verifiable secret sharing scheme.
This scheme has some features similar to an absolute VSS from earlier work. Then we
propose several protocols to make it proactive. Following the method of Herzberg et al.,
the lifetime of the secret is divided into periods of time in the proactive scheme. 
In each time period, the n shares will be renewed while the secret remains the 
same. In this way, a mobile adversary who is able to attack (learn or corrupt) at 
most b shares in a time period cannot learn any information about the secret in 
the long lifetime. This scheme is also robust, i.e., the secret can be reconstructed 
at any time. 

Furthermore, we introduce some combinatorial structures in the scheme so 
that the scheme will be more efficient. With the combinatorial structure, most 
of the computation of the system will depend on the parameter b. Thus there is 
a “trade-off” between the computation and the value of b: when b is smaller (the 
ability of the adversary is more limited), the computation takes less time. Thus 
our scheme is more efficient when the number of possibly
corrupted servers is much smaller than the total number of
servers in the system. On the other hand, our combinatorial method might be
easily adapted to previously constructed proactive schemes to make them more efficient.

The rest of this paper is arranged as follows. In Section 2 we give some
preliminaries and the main settings of the system. Section 3 describes our new
verifiable secret sharing scheme. We also propose an anonymous VSS in a
subsection. Section 4 describes the proactive scheme without combinatorial structure.
Section 5 introduces the combinatorial structure and describes how to apply it
to the proactive scheme.

2 Preliminaries



2.1 Previous Work 

Proactive refers to the security of the scheme in the presence of a mobile
adversary who may corrupt all participants of the scheme throughout the lifetime of
the system but cannot corrupt too many participants during any short period
of time. Such a mobile adversary was first considered by Ostrovsky and Yung.

The motivation of Ostrovsky and Yung is to combat mobile viruses. Their scheme requires the
participants to constantly exchange messages and to be able to erase parts of their
memory. A polynomial secret sharing proactive scheme is proposed which uses
a verifiable secret sharing scheme based on zero-knowledge proofs.

Herzberg et al. further discussed proactive secret sharing schemes and
gave a detailed practical scheme. In their scheme the lifetime is divided into
periods of time. At the beginning of each time period, the share holders engage in an
interactive update protocol which includes a share recovery protocol and a share
renewal protocol. At the end of the period, each shareholder holds completely
new shares of the same secret. The secret will not be computed during the update
phase, while it can be reconstructed at any time. They used a polynomial-based
method from earlier work for the renewal protocol. They also proposed a
polynomial-based method for the share recovery protocol. The verifiable secret sharing schemes
they used are those of Feldman and Pedersen.

There are also many papers that discuss proactive security, see e.g., 
and their references. Our discussion will mainly follow the papers 






2.2 The Setting 

We will follow the setting of the previous work described above. We assume that there is a
system of $n$ servers $P_1, P_2, \ldots, P_n$, which are connected to a common
cast channel such that messages sent through this channel instantly reach every 
server. We also assume that the system is synchronized, i.e., the servers can 
access a common global clock, and that each server has a local source of ran- 
domness. To make things simpler, we assume that there are private channels 
between each pair of servers and that messages sent by broadcast are safely 
authenticated. With these assumptions, we are able to focus on the proactive 
scheme itself. 

There is an adversary which can corrupt b servers during any time period. 
Corrupting a server means learning the secret information in the server, mod- 
ifying its data, sending out wrong message, changing the intended behavior of 
the server, disconnecting it, and so on. Since the server can be rebooted, the 
adversary is a mobile one. 

A secret value $s \in GF(q)$ will be shared by the servers through the scheme.
The value of $s$ needs to be maintained for a long period of time. The lifetime
is divided into time periods which are determined by the global clock. At the
beginning of each time period the servers engage in an interactive update
protocol. The update protocol will not reveal the value of $s$. At the end of the period
the servers hold new shares of $s$. The mobile adversary who corrupts $b$ servers in
a time period cannot get any information about the secret value $s$. The system
can reproduce $s$ in the presence of the mobile adversary at any time.

We consider unconditional security in this paper, which means that the
adversary cannot guess the secret with probability better than $1/q$ if the secret
$s \in GF(q)$.



3 Verifiable Secret Sharing 

Since secret sharing schemes were proposed initially by Shamir and
Blakley, research on this topic has been extensive. In the “classic” secret sharing
schemes, there are assumed to be no faults in the system. Tompa and Woll,
and McEliece and Sarwate, first considered schemes with faulty participants
and gave partial solutions for that problem. In their schemes, the dealer is always
assumed honest. Chor et al. first defined the complete notion of Verifiable
Secret Sharing (VSS), and gave a solution which is based on some cryptographic
assumption. In a VSS, each holder of a share can verify that the share is
consistent with the other shares. Thus both the dealer and other participants can be
verified in such a scheme. There are two aspects of the security in a VSS. One
is the security of the secret and the other is the security of the verification.

Many recent papers discuss VSS. Most schemes use
zero-knowledge proofs. Others use cryptographic assumptions
such as the hardness of the discrete logarithm problem. A simple
and efficient VSS has also been proposed, but it is based on a “collision resistance” assumption. On
the other hand, many known VSS are not easy to adapt to the proactive setting.

Some VSS have been used in proactive schemes. Ostrovsky and Yung used a
VSS based on zero-knowledge proofs, while Herzberg et al. used the VSS of
Feldman and Pedersen. The security of Feldman’s scheme is based on the
hardness of solving discrete logarithms. In Pedersen’s scheme, the privacy of the
secret is unconditionally secure, but the verification depends on a computational
assumption.

In H it was shown that in any unconditionally secure VSS, b < ^. Thus the 
VSS with b > ^ will either depend on some cryptographic assumption or have 
small probability of errors. In this section, we will propose an unconditionally 
secure VSS with b < ^ — 1, which is simpler and more efficient than the scheme 
in Q. Moreover, our scheme has the threshold property that any coalition of 
t — 1 participants cannot get any information about the secret value (regardless 
of whether the coalition consists of good or bad participants), a property which 
the scheme of H does not have, since secret information may be revealed during 
the “share” protocol. Another feature of our scheme is that it requires less secret 
information to be communicated by the dealer, and the dealer is not required to 
take part in the protocol after the initial distribution of secret information.

3.1 Definition

Now we give a formal definition of VSS, as follows. 

Suppose there are a dealer $D$ and $n$ other participants $P_1, P_2, \ldots, P_n$
connected by private communication channels. They also have access to a broadcast
channel. There is a static adversary $A$ that can corrupt up to $b$ of the
participants, including $D$. Here static means that the $b$ participants controlled by
the adversary are fixed.

Let $\pi$ be a protocol consisting of two phases, Share and Reconstruct. Let $S$
be the set of possible secret values. At the beginning of Share, the dealer inputs
a secret $s \in S$. At the end of Share each participant $P_i$ is instructed to output
a Boolean value $ver_i$. At the end of Reconstruct each participant is instructed
to output a value in $S$.

The protocol $\pi$ is an unconditionally secure Verifiable Secret Sharing protocol
if the following properties hold:

1. If a good player $P_i$ outputs $ver_i = 0$ at the end of Share then every good
player outputs $ver_i = 0$;

2. If the dealer is good, then $ver_i = 1$ for every good $P_i$;

3. If at least $n - b$ players $P_i$ output $ver_i = 1$ at the end of Share, then there
exists an $s' \in S$ such that the event that all good $P_i$ output $s'$ at the end of
Reconstruct is fixed at the end of Share, and $s' = s$ if the dealer is good;

4. If $|S| = q$ and $s$ is chosen randomly from $S$, and the dealer is good, then any
coalition of at most $t - 1$ participants cannot guess at the end of Share the
value $s$ with probability better than $1/q$.



3.2 The New VSS 

In this subsection we provide a new unconditionally secure VSS which will be 
used in our proactive scheme later. 

Suppose there is a dealer $D$ and $n$ participants $P_i$, $1 \le i \le n$, where $n \ge t + 3b$
and $t > b$. Let $S = GF(q)$ be a finite field and let $\omega$ be a primitive element in
$GF(q)$. In the following protocol, all the computations are in the field $GF(q)$.
We first state the share phase as follows.

Share 



1. When $D$ wants to share a secret value $s \in S$, he chooses a random symmetric
polynomial

$$f(x, y) = \sum_{i=0}^{t-1} \sum_{j=0}^{t-1} a_{ij} x^i y^j,$$

where $a_{00} = s$ and $a_{ij} = a_{ji}$ for all $i, j$. Then, for each $k$, $D$ sends $h_k(x) =
f(x, \omega^k)$ to $P_k$ through a private channel.

2. After receiving $h_k(x)$, each $P_k$ sends $h_k(\omega^l)$ to $P_l$ for $1 \le l \le n$ ($l \neq k$)
through a private channel.

3. Each $P_l$ checks whether $h_k(\omega^l) = h_l(\omega^k)$ for $1 \le k \le n$ ($k \neq l$). If $P_l$ finds
that $h_k(\omega^l) \neq h_l(\omega^k)$, then $P_l$ broadcasts $(l, k)$.

4. Each $P_i$ computes the maximum subset $G \subseteq \{1, \ldots, n\}$ such that any ordered
pair $(l, k) \in G \times G$ is not broadcasted. If $|G| \ge n - b$, then $P_i$ outputs $ver_i = 1$.
Otherwise, $P_i$ outputs $ver_i = 0$.
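The four steps of Share can be sketched in code. The example below is an illustration under stated assumptions: the parameters ($q = 13$, $\omega = 2$, $t = 2$, $b = 1$, $n = 5$), the function names, and the brute-force search for the maximum subset $G$ are choices made for this sketch and are not prescribed by the protocol description. One player, $P_3$, is simulated as corrupted and sends wrong values in step 2.

```python
import itertools
import random

# Toy parameters (assumption): GF(13) with primitive element 2, n = t + 3b.
q, w = 13, 2
t, b, n = 2, 1, 5


def random_symmetric(secret, rng):
    """Coefficient matrix a[i][j] of a random symmetric f(x, y) with a[0][0] = secret."""
    a = [[0] * t for _ in range(t)]
    for i in range(t):
        for j in range(i, t):
            a[i][j] = a[j][i] = rng.randrange(q)
    a[0][0] = secret
    return a


def share_poly(a, k):
    """Coefficients of h_k(x) = f(x, w^k)."""
    y = pow(w, k, q)
    return [sum(a[i][j] * pow(y, j, q) for j in range(t)) % q for i in range(t)]


def evaluate(coeffs, x):
    """Evaluate a polynomial given by its coefficient list at x over GF(q)."""
    return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q


def complaints(h, sent):
    """Pairs (l, k) broadcast in step 3; sent[k][l] is the value P_k sent to P_l."""
    return {(l, k) for l in h for k in h
            if l != k and sent[k][l] != evaluate(h[l], pow(w, k, q))}


def max_consistent_subset(pairs):
    """Step 4 by brute force: a largest G with no broadcast pair inside G x G."""
    for size in range(n, 0, -1):
        for G in itertools.combinations(range(1, n + 1), size):
            if not any((l, k) in pairs for l in G for k in G):
                return set(G)
    return set()


rng = random.Random(0)
a = random_symmetric(5, rng)
h = {k: share_poly(a, k) for k in range(1, n + 1)}                     # step 1
honest = {k: {l: evaluate(h[k], pow(w, l, q)) for l in h if l != k}     # step 2
          for k in h}

# Simulate a corrupted P_3 that sends wrong values to everyone.
corrupted = {k: dict(v) for k, v in honest.items()}
corrupted[3] = {l: (v + 1) % q for l, v in honest[3].items()}

pairs = complaints(h, corrupted)                                        # step 3
G = max_consistent_subset(pairs)                                        # step 4
ver = 1 if len(G) >= n - b else 0
```

With all players honest there are no complaints, by the symmetry of $f$. With $P_3$ corrupted, every other player broadcasts a complaint against $P_3$, so $G = \{1, 2, 4, 5\}$ and $|G| = n - b$.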

It is obvious that every good participant computes the same subset G in 
the end of Share. Next we consider the reconstruct phase. Note that although 
the adversary is static, he could provide correct information in Share phase but 
wrong information in Reconstruct phase. 

Reconstruct 

1. Each $P_i$ sends $h_i(0)$ to each $P_k$, where $i \in G$.

2. After receiving the values $h_i(0)$, $P_k$ computes a polynomial $f_k(0, y)$ such that $f_k(0, \omega^i) =
h_i(0)$ for at least $n - 2b$ of the data he received. This can be done efficiently
using known methods.

3. $P_k$ computes and outputs $s' = f_k(0, 0)$.

In order to prove that the protocol is an unconditionally secure VSS, we need 
the following lemma. 

Lemma 1 Suppose there are $T$ polynomials $h_1(x), h_2(x), \ldots, h_T(x)$ with degree
at most $t - 1$, where $T \ge t$, such that $h_i(\omega^j) = h_j(\omega^i)$ for all $i, j$. Then there
exists a polynomial $h(x)$ of degree at most $t - 1$ such that $h(\omega^i) = h_i(0)$ for all
$i$, $1 \le i \le T$. Equivalently, any $t$ of the shares $h_i(0)$, $1 \le i \le T$, determine the
same secret $K = h(0)$.

Proof First we note that for any $t$-subset $I = \{i_1, i_2, \ldots, i_t\} \subseteq \{1, 2, \ldots, T\}$
and any $h_j(x)$, where $1 \le j \le T$, we can use the Lagrange interpolation formula
to compute

$$h_j(0) = \sum_{i \in I} h_j(\omega^i) b_i = \sum_{i \in I} h_i(\omega^j) b_i,$$

where

$$b_i = \prod_{k \in I,\, k \neq i} \frac{\omega^k}{\omega^k - \omega^i}, \quad i \in I.$$

This comes from the condition $h_i(\omega^j) = h_j(\omega^i)$ for any $i, j$.

Now suppose that $I$ and $J$ are two different $t$-subsets of $\{1, 2, \ldots, T\}$. Then
we can compute a polynomial $h_I(x)$ such that $h_I(\omega^i) = h_i(0)$ for all $i \in I$, and
then $h_I(0)$ can be obtained by the Lagrange interpolation:

$$h_I(0) = \sum_{i \in I} h_i(0) b_i.$$

By the above discussion we have

$$\begin{aligned}
h_I(0) &= \sum_{i \in I} b_i h_i(0) \\
&= \sum_{i \in I} b_i \sum_{j \in J} h_i(\omega^j) b_j \\
&= \sum_{j \in J} b_j \sum_{i \in I} h_i(\omega^j) b_i \\
&= \sum_{j \in J} b_j h_j(0) \\
&= h_J(0).
\end{aligned}$$

□



We are now in a position to prove the following theorem. 

Theorem 2 The scheme of this section is an unconditionally secure verifiable 
secret sharing scheme. 

Proof We prove that the above scheme satisfies the conditions of the VSS as 
follows. 

1. If a good player $P_i$ outputs $ver_i = 0$, then the size of the maximum subset
$G$ is at most $n - b - 1$. Thus every good player will output “0”.

2. If the dealer is good, then each good player $P_i$ receives $f(x, \omega^i)$. Since $f(x, y)$
is symmetric, $f(\omega^i, \omega^k) = f(\omega^k, \omega^i)$ for all good players $P_i$ and $P_k$. Thus all good players
are in the subset $G$. Therefore $ver_i = 1$ for each good player $P_i$.

3. Suppose at least $n - b$ players output “1” at the end of Share. Then
there is a subset $G$ of size $n - b$ such that no one in the subset complained
about the others. Since we assume that there are at most $b$ bad players, there are at
least $n - 2b$ good players in $G$, who all have consistent shares. By Lemma 1,
any $t$-subset of the good players can compute the same value $K$. It is easy to
check that if the dealer is good, then we have $K = s$, the secret value. Further,
at most $b$ out of the $n - b$ shares in $G$ are not consistent with the secret $K$. Since
$n - b \ge t + 2b$, known algorithms can be used to find the maximum consistent
set of shares and thus determine $K$.

4. Without loss of generality, we assume that the coalition knows the values
of $h_1(x), h_2(x), \ldots, h_{t-1}(x)$. It is easy to show that for any value
$s' \in GF(q)$, we can find $b_{ij} \in GF(q)$, where $b_{00} = s'$, $b_{ij} = b_{ji}$, $0 \le i, j \le t - 1$,
such that if

$$f'(x, y) = \sum_{i=0}^{t-1} \sum_{j=0}^{t-1} b_{ij} x^i y^j$$

then $f'(x, \omega^k) = h_k(x)$ for $k = 1, 2, \ldots, t - 1$. □

Remark. This scheme is modified from Blom’s key predistribution scheme (see
[?] for the details). For simplicity, our description used a Reed-Solomon code
instead of general MDS codes. It is straightforward to generalize our scheme by
using MDS codes.

3.3 An Example 

We display a toy example in this subsection. Let $q = 13$, $\omega = 2$, $n = 9$, $t = 3$
and $b = 2$. First suppose the dealer $D$ is good. $D$ selects a polynomial as
follows:

$$f(x, y) = 3 + 9x + 2x^2 + 9y + 2y^2 + 8xy + 11xy^2 + 11x^2y + 4x^2y^2.$$

Then $D$ sends the vector $(v_1, v_2, v_3)$ to the players as follows, each of which
determines a polynomial $v_1 + v_2 x + v_3 x^2$:

$h_1 \to (3, 4, 1)$
$h_2 \to (6, 9, 6)$
$h_3 \to (8, 10, 8)$
$h_4 \to (9, 2, 6)$
$h_5 \to (12, 11, 4)$
$h_6 \to (9, 12, 8)$
$h_7 \to (6, 11, 9)$
$h_8 \to (12, 10, 9)$
$h_9 \to (7, 12, 1)$
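The share vectors can be recomputed from the dealer's polynomial. The sketch below assumes that player $P_k$ receives $h_k(x) = f(x, \omega^k)$ over $GF(13)$ and stores $f$ as a symmetric coefficient matrix; the helper names `share`, `evaluate` and `consistent` are illustrative, not part of the scheme's description.

```python
Q, OMEGA, N, T = 13, 2, 9, 3

# COEFF[i][j] is the coefficient of x^i y^j in f(x, y); the matrix is symmetric.
COEFF = [
    [3, 9, 2],
    [9, 8, 11],
    [2, 11, 4],
]

def share(k):
    """Coefficient vector (v1, v2, v3) of h_k(x) = f(x, OMEGA^k) over GF(Q)."""
    y = pow(OMEGA, k, Q)
    return tuple(
        sum(COEFF[i][j] * pow(y, j, Q) for j in range(T)) % Q
        for i in range(T)
    )

def evaluate(vec, x):
    """Evaluate v1 + v2*x + v3*x^2 modulo Q."""
    return sum(c * pow(x, e, Q) for e, c in enumerate(vec)) % Q

shares = {k: share(k) for k in range(1, N + 1)}

def consistent(i, j):
    """Pairwise test used by the players: h_i(OMEGA^j) == h_j(OMEGA^i)."""
    return (evaluate(shares[i], pow(OMEGA, j, Q))
            == evaluate(shares[j], pow(OMEGA, i, Q)))

for k, vec in shares.items():
    print(k, vec)
```

Because the coefficient matrix is symmetric, `consistent(i, j)` holds for every pair of honest players, which is exactly the property the Share phase relies on.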

Suppose that only $P_1$ and $P_2$ are bad and send wrong data to the other
players. Then the pairs broadcast are of the form $(1, i)$, $(2, i)$, $(i, 1)$ or $(i, 2)$.
So the good players will find $G = \{3, 4, 5, 6, 7, 8, 9\}$ and output “1”. Since we
assume that there are at most 2 bad players, all the good players will output “1”
if the dealer is good. On the other hand, the player $P_i$ chooses “0” or “1” only
depending on the broadcast pairs, so all good players will output the same
value of $ver_i$.

Now suppose that there are at least 7 players who output “1”. Since there are
at most 2 bad players, it is true that the subset $G$ is found. Suppose, for example,
$G = \{1, 2, 3, 4, 5, 6, 7\}$. Then all the good players in $G$ possess consistent shares
regardless of whether $D$ is good or bad. However, up to two of these players
may be bad, and send incorrect shares during Reconstruction. Thus in the
Reconstruction phase, there are at least 5 consistent shares held by each of the
players. Thus each good player will compute the same polynomial $f(x, 0)$ using
the methods of [?].
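For parameters this small, the reconstruction step can be illustrated by exhaustive search rather than by an efficient decoder. The sketch below is our own stand-in, not the decoding method the text refers to: it assumes $G = \{1, \ldots, 7\}$, lets $P_1$ and $P_2$ send wrong values, and looks for a polynomial of degree at most $t - 1$ that agrees with at least $n - 2b = 5$ of the received values. The function names are hypothetical.

```python
from itertools import combinations

Q, OMEGA, T = 13, 2, 3
NEED = 5  # n - 2b with n = 9 and b = 2

def lagrange_eval(points, x):
    """Value at x of the polynomial of degree < len(points) through points, in GF(Q)."""
    total = 0
    for xi, yi in points:
        term = yi
        for xk, _ in points:
            if xk != xi:
                term = term * (x - xk) * pow((xi - xk) % Q, -1, Q) % Q
        total = (total + term) % Q
    return total

def reconstruct(received, need):
    """Brute-force search; returns the candidate secret or None if nothing fits."""
    pts = [(pow(OMEGA, i, Q), v) for i, v in received.items()]
    for subset in combinations(pts, T):
        agree = sum(lagrange_eval(subset, x) == y for x, y in pts)
        if agree >= need:
            return lagrange_eval(subset, 0)
    return None

# Values h_i(0) sent by players 1..7; players 1 and 2 lie (true values: 3 and 6).
received = {1: 0, 2: 1, 3: 8, 4: 9, 5: 12, 6: 9, 7: 6}
print(reconstruct(received, NEED))  # prints 3
```

A wrong polynomial can match at most two honest values plus the two forged ones, so only the correct polynomial reaches five agreements. The search is exponential in general; it only serves to make the threshold argument concrete.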

3.4 VSS without Dealer 

Secret sharing without dealer means that there is no dealer in the scheme who
knows and distributes the secret. Secret sharing without dealer was first considered
in [?]. One such secret sharing scheme is considered in [?].

We can remove the dealer from our scheme as follows. The other properties 
of the scheme are the same as in the previous subsection. 

Share 
1. Each $P_k$ chooses an independent random symmetric polynomial

$$f^{(k)}(x, y) = \sum_{i=0}^{t-1} \sum_{j=0}^{t-1} a_{ij}^{(k)}\, x^i y^j,$$

where $a_{00}^{(k)} = s_k$ and $a_{ij}^{(k)} = a_{ji}^{(k)}$ for all $i, j$. Then $P_k$ sends
$h_l^{(k)}(x) = f^{(k)}(x, \omega^l)$ to $P_l$ through a private channel.

2. After receiving $h_l^{(k)}(x)$, each $P_l$ sends $h_l^{(k)}(\omega^m)$ to $P_m$ for
$1 \le m \le n$ through a private channel.

3. $P_m$ checks whether $h_m^{(k)}(\omega^l) = h_l^{(k)}(\omega^m)$ for $1 \le l \le n$. If $P_m$
finds that $h_m^{(k)}(\omega^l) \ne h_l^{(k)}(\omega^m)$, then $P_m$ broadcasts $(k; m, l)$.

4. For every $k \ne m$, each player $P_m$ computes the maximum subset $G_k$ such
that for any pair $(m, l) \in G_k \times G_k$, $(k; m, l)$ is not broadcast. If $|G_k| \ge n - b$,
then $P_m$ puts the value $k$ in a list $\mathcal{L}$.

5. If $|\mathcal{L}| \ge n - b$, then $P_m$ outputs $ver_m = 1$ and computes his share as

$$h_m(x) = \sum_{l \in \mathcal{L}} h_m^{(l)}(x).$$

Otherwise, $P_m$ refuses the shares and outputs $ver_m = 0$.

The reconstruct phase is the same as in the previous scheme. Note that in this
scheme the shared secret is

$$s = \sum_{i \in \mathcal{L}} s_i.$$

In this scheme, each player in turn plays the part of the dealer. Thus the security
of the scheme follows from Theorem 2. We need only to show that each good player
has the same list $\mathcal{L}$, which is obvious.

Remark. As we indicated before, our VSS is modified from Blom’s key pre- 
distribution scheme. In the original scheme, there is a dealer to construct the 
schemes. Using the methods of this section, we obtain a key predistribution 
scheme without dealer. 

4 New Proactive Scheme 

In this section, we describe our proactive secret sharing scheme without
combinatorial structure. In the next section, we will add combinatorial structures
to this scheme to improve its efficiency.

4.1 Initialization 

In the initial step, we assume that there is a dealer to set up the scheme. After 
the initialization phase, the dealer will no longer be needed. 

In the initialization, we use the share phase of the VSS described in
Section 3, but we assume that $t > b + 1$. The first four steps are the same. Then
we do the following.

5. If at least $n - b$ of the servers output $ver_i = 1$, then the dealer erases all the
information about the scheme on his end. Otherwise, the dealer reboots the
whole system and initializes the system again.



4.2 Share Renewal 

In the share renewal phase, all good servers do the following: 

1. Each server $P_l$ selects a random symmetric polynomial

$$r^{(l)}(x, y) = \sum_{i=0}^{t-1} \sum_{j=0}^{t-1} r_{ij}\, x^i y^j,$$

where $r_{00} = 0$ and $r_{ij} = r_{ji}$ for all $i, j$.

2. $P_l$ sends $h_k^{(l)}(x) = r^{(l)}(x, \omega^k)$ to $P_k$ for $k = 1, 2, \ldots, n$ by a private
channel and broadcasts $h_0^{(l)}(x) = r^{(l)}(x, 0)$.

3. $P_k$ checks whether $h_0^{(l)}(0) = 0$ and $h_k^{(l)}(0) = h_0^{(l)}(\omega^k)$. If the conditions
are satisfied, then $P_k$ computes and sends $P_m$ the value $h_k^{(l)}(\omega^m)$. Otherwise
$P_k$ broadcasts an accusation of $P_l$.

4. $P_m$ checks whether $h_m^{(l)}(\omega^k) = h_k^{(l)}(\omega^m)$ for the values of $l$ not accused
by $n - b$ servers of the system. If the equation is not true for more than $b$ values
of $k$, then $P_m$ broadcasts an accusation of $P_l$.

5. If $P_l$ is accused by at most $b$ servers, then he can defend himself as follows.
For those $P_i$ that $P_l$ is accused by, $P_l$ broadcasts $h_i^{(l)}(x)$. Then server $P_k$
checks whether $h_i^{(l)}(\omega^k) = h_k^{(l)}(\omega^i)$ and broadcasts “yes” or “no”. If there
are at least $n - b - 2$ servers broadcasting yes, then $P_l$ is not a bad server.

6. $P_m$ updates the list of bad servers $\mathcal{L}$ by including all values $l$ for which $P_l$
is accused by at least $b + 1$ servers, or found bad in the previous step. Then
$P_m$ updates its shares as

$$h_m(x) \leftarrow h_m(x) + h_m^{(k)}(x)$$

for all $k \notin \mathcal{L}$.
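The effect of a renewal round can be sketched in a few lines. The example below assumes that every server is honest (so the accusation and defense steps never trigger), reuses the toy polynomial of Section 3.3 as the current sharing, and draws the update polynomials with a seeded pseudo-random generator; it is an illustration under those assumptions only, and all function names are ours.

```python
import random

Q, OMEGA, N, T = 13, 2, 9, 3
F = [[3, 9, 2], [9, 8, 11], [2, 11, 4]]  # current sharing polynomial, secret F[0][0]

def random_update(rng):
    """Random symmetric T x T coefficient matrix with r_00 = 0."""
    r = [[0] * T for _ in range(T)]
    for i in range(T):
        for j in range(i, T):
            r[i][j] = r[j][i] = rng.randrange(Q)
    r[0][0] = 0  # the update must not change the shared secret
    return r

def share(coeff, k):
    """Coefficients of coeff(x, OMEGA^k), i.e. the polynomial sent to server k."""
    y = pow(OMEGA, k, Q)
    return [sum(coeff[i][j] * pow(y, j, Q) for j in range(T)) % Q for i in range(T)]

def add(u, v):
    return [(a + b) % Q for a, b in zip(u, v)]

def evaluate(vec, x):
    return sum(c * pow(x, e, Q) for e, c in enumerate(vec)) % Q

rng = random.Random(1999)
updates = [random_update(rng) for _ in range(N)]  # one update polynomial per server

# Each server adds the update shares it received to its old share.
new_shares = {}
for m in range(1, N + 1):
    h = share(F, m)
    for r in updates:
        h = add(h, share(r, m))
    new_shares[m] = h

# The renewed shares belong to the sum polynomial, whose constant term is unchanged.
total = [[(F[i][j] + sum(r[i][j] for r in updates)) % Q for j in range(T)]
         for i in range(T)]
print(total[0][0])  # prints 3
```

Since every update polynomial is symmetric with zero constant term, the renewed shares remain pairwise consistent and still encode the original secret, while the individual share polynomials change.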

Remark. We can remove the private channels in step 2, since our scheme is also
a key predistribution scheme and the servers $P_i$ and $P_j$ can use
$h_i(\omega^j) = h_j(\omega^i)$ as a key to communicate securely.

To check the security of the renewal phase, first we note that any coalition of
at most $b$ servers cannot get any information about any shares except their own.
In fact, a server $P_l$ only knows $h_l^{(k)}(x)$ and $h_0^{(k)}(x)$. Since $b < t - 1$, the
coalition of $b$ servers knows at most $t - 1$ polynomials, which cannot reveal
$r^{(k)}(x, y)$ (see, e.g., [?]). Secondly, from the protocol we know that every good
server should have the same list $\mathcal{L}$. Therefore, the good servers will keep
consistent shares after renewal.

Note that a good server $P_l$ can be accused by at most $b$ servers. In this case,
$P_l$ will broadcast $b$ polynomials in its defense. Thus $P_l$ will broadcast a total of
$b + 1$ polynomials. Since $t > b + 1$, this information will not reveal $r^{(l)}(x, y)$.
On the other hand, suppose $P_l$ gives $P_i$ a wrong share, i.e., the share $P_i$ received
is not consistent with the values held by the majority of good servers. Then
$P_i$ will accuse $P_l$ in step 4, since the number of inconsistent values exceeds $b$.
If $P_l$ broadcasts a correct share in the defense, then $P_i$ can correct his share.
Otherwise, $P_l$ will be found to be bad.



4.3 Recover a Share 

When a server is corrupted or replaced, it needs to be rebooted and thus it needs 
to recover the secret shares. 

We first provide a protocol, to detect the corrupted servers, which we call 
detection. 

Detection 

1. $P_l$ computes and sends $h_l(\omega^k)$ to $P_k$ for $k = 1, 2, \ldots, n$ by private channels.

2. $P_k$ checks whether $h_l(\omega^k) = h_k(\omega^l)$. $P_k$ then broadcasts an accusation
$list_k$ which contains those $l$ such that $h_l(\omega^k) \ne h_k(\omega^l)$ or $h_l(\omega^k)$ was
not received.

3. Each good server updates the list $\mathcal{L}$ so that it contains those $l$ accused by at
least $b + 1$ servers of the system.

After running Detection, the system will recover the shares for all servers $P_l$,
where $l \in \mathcal{L}$. The recovery protocol is as follows.

1. For each $l \in \mathcal{L}$, every good server $P_i$ computes and sends $h_i(\omega^l)$ to $P_l$.

2. Upon receiving the data, $P_l$ computes a polynomial $h_l(x)$ such that
$h_l(\omega^k) = h_k(\omega^l)$ for the majority of $k$ it received, using the algorithms
of [?]. $P_l$ sets $h_l(x)$ as its share.
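The recovery step can be sketched for the toy parameters of Section 3.3. Here we assume that server $P_1$ has lost its share, that servers $P_2, \ldots, P_9$ each send $h_k(\omega^1)$, and that $P_9$ sends a wrong value; the exhaustive search over $t$-subsets is our simplification and stands in for the efficient algorithms the text cites. All helper names are hypothetical.

```python
from itertools import combinations

Q, OMEGA, N, T = 13, 2, 9, 3
F = [[3, 9, 2], [9, 8, 11], [2, 11, 4]]  # toy polynomial f(x, y), symmetric

def share(k):
    y = pow(OMEGA, k, Q)
    return [sum(F[i][j] * pow(y, j, Q) for j in range(T)) % Q for i in range(T)]

def evaluate(vec, x):
    return sum(c * pow(x, e, Q) for e, c in enumerate(vec)) % Q

def interpolate(points):
    """Coefficients (low to high) of the polynomial through the given points, in GF(Q)."""
    result = [0] * len(points)
    for xi, yi in points:
        basis, denom = [1], 1
        for xk, _ in points:
            if xk != xi:
                # multiply the running product by (x - xk)
                basis = [(a - xk * b) % Q for a, b in zip([0] + basis, basis + [0])]
                denom = denom * (xi - xk) % Q
        scale = yi * pow(denom, -1, Q) % Q
        for d, c in enumerate(basis):
            result[d] = (result[d] + scale * c) % Q
    return result

def recover(received):
    """Return the polynomial that agrees with a majority of the received values."""
    pts = [(pow(OMEGA, k, Q), v) for k, v in received.items()]
    for subset in combinations(pts, T):
        cand = interpolate(subset)
        if 2 * sum(evaluate(cand, x) == y for x, y in pts) > len(pts):
            return tuple(cand)
    return None

lost = 1
received = {k: evaluate(share(k), pow(OMEGA, lost, Q)) for k in range(2, N + 1)}
received[9] = (received[9] + 1) % Q  # one bad server sends a wrong value
print(recover(received))  # prints (3, 4, 1)
```

The recovered vector equals the share $h_1$ of the example, because symmetry gives $h_k(\omega^1) = h_1(\omega^k)$ for every honest $P_k$.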



4.4 Reconstruct the Secret 

The reconstruction protocol is similar to the Reconstruct phase of the VSS
introduced in Section 3. We need only to change the first two steps as follows.

1’ For each good server $P_i$, $P_i$ sends $h_i(0)$ to $P_k$, where $k$ is not in the list $\mathcal{L}$.
2’ After receiving $h_i(0)$, $P_k$ computes a polynomial $f_k(0, y)$ such that
$f_k(0, \omega^i) = h_i(0)$ for at least $n - 2b$ of the data he received.



5 Combinatorial Structure 

In this section, we will introduce some combinatorial structure into our scheme. 
The combinatorial structure provides a predetermined arrangement of the servers 
which permits the possibility of reducing the computation of the scheme. 
5.1 Set Systems

A set system is a pair $(X, \mathcal{B})$, where $X$ is a set of $n$ points and $\mathcal{B}$ is a
collection of subsets of $X$ called blocks.

We will use a set system with the following properties, where t < ^ — 1: 

1. $|B| \ge t$ for any $B \in \mathcal{B}$.

2. For any subset $F \subseteq X$ with $|F| \le b$, there exists a $B \in \mathcal{B}$ such that
$F \cap B = \emptyset$.

It is easy to see that such a set system exists. For example, we can choose
$\mathcal{B}$ to be all the $t$-subsets of $X$. However, there are often better set systems
(i.e., set systems containing fewer blocks). The following definition is well-known
(see, e.g., [?]).

Definition A collection $\mathcal{T}$ of $k$-subsets of $\{1, \ldots, n\}$ (called blocks) is an
$(n, k, b)$-covering if every $b$-subset of $\{1, \ldots, n\}$ is contained in at least one block.

It is easy to see that if $(X, \mathcal{T})$ is an $(n, n - t, b)$-covering, then the set system

$$\{\{1, \ldots, n\} \setminus T : T \in \mathcal{T}\}$$

is a set system satisfying our purpose. There are several efficient constructions
of $(n, n - t, b)$-coverings in [?] which can be easily implemented by a computer.
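The passage from a covering to a suitable set system can be verified by brute force on a small instance. The sketch below uses the complements of the seven lines of the Fano plane as a $(7, 4, 2)$-covering; this instance is our own choice for illustration and is not claimed to satisfy the other parameter constraints of the proactive scheme. The function names are hypothetical.

```python
from itertools import combinations

def is_covering(n, k, b, blocks):
    """True if all blocks have size k and every b-subset of {1..n} lies in some block."""
    return (all(len(T) == k for T in blocks)
            and all(any(set(F) <= T for T in blocks)
                    for F in combinations(range(1, n + 1), b)))

def complement_system(n, blocks):
    """The set system {{1..n} \\ T : T in blocks}."""
    points = set(range(1, n + 1))
    return [points - T for T in blocks]

def satisfies_conditions(n, t, b, blocks):
    """Condition 1: |B| >= t.  Condition 2: every F with |F| <= b misses some block."""
    if any(len(B) < t for B in blocks):
        return False
    return all(any(not (set(F) & B) for B in blocks)
               for size in range(b + 1)
               for F in combinations(range(1, n + 1), size))

# Lines of the Fano plane on the points 1..7.
FANO = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
cover = complement_system(7, FANO)      # seven blocks of size 4
system = complement_system(7, cover)    # back to the seven lines

print(is_covering(7, 4, 2, cover))          # prints True
print(satisfies_conditions(7, 3, 2, system))  # prints True
```

For $n = 7$, $t = 3$, $b = 2$ this gives 7 blocks, compared with the 35 blocks obtained by taking all 3-subsets.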

5.2 Applying Set System to the Proactive VSS 

The idea of using the set system is to reduce the computations for the share
renewal and share recovery protocols. In the scheme of Section 4, share renewal and
share recovery used the data from all the participants. However, these operations
can be carried out using the data from $t$ good servers. For example, in the share
renewal protocol, any $t$ good servers can renew the shares, since the shares are
polynomials of degree at most $t - 1$. In the protocol of Section 4, every good server
provides information to renew shares. So there are redundant computations. If
the system can determine $t$ good servers, then the protocol will be more efficient.
Note that there are at least $3t + 1$ good servers in the system. Thus we can save
at least one third of the computations. On the other hand, we should be very
careful when the $t$ good servers are selected, since the adversary is mobile. A
good server could turn bad at any time. Thus in the scheme of this section,
we will actually select correct information instead of good servers, although we
will still use “good server” for convenience.

Now let us use the set system to improve our proactive scheme. Suppose
$(X, \mathcal{B})$ is a set system satisfying the conditions of Subsection 5.1, where
$X = \{1, 2, \ldots, n\}$ and $\mathcal{B} = \{B_1, B_2, \ldots, B_s\}$. The set system is published so
that each participant can consult it.

Note that in our scheme, in any phase there is a list $\mathcal{L}$ containing all the bad
servers. By the property of the set system, there is a block $B$ which contains
only good servers. If the system can determine one of the “good” blocks, then
the system can renew the shares or recover the shares using only the data from
these servers. We will call these servers the members of an executive committee.
For a list $\mathcal{L}$ of bad servers, the system can determine the following list of blocks:
$B_{i_1}, B_{i_2}, \ldots, B_{i_e}$, such that $B_{i_j} \cap \mathcal{L} = \emptyset$, $j = 1, 2, \ldots, e$, and
$1 \le i_1 < i_2 < \cdots < i_e \le s$. These blocks are called executive committee
candidates. Note that the adversary is mobile, therefore we cannot guarantee that
these candidates contain only good servers in the next time period.
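Selecting the executive committee candidates amounts to filtering the published blocks against the current list of bad servers. The sketch below assumes blocks are given as Python sets indexed from 1, and uses a caller-supplied predicate `passes_checks` as a hypothetical stand-in for the verification a candidate must survive before it becomes the executive committee.

```python
def candidates(blocks, bad):
    """Blocks B_i disjoint from the list of bad servers, in increasing index order."""
    return [(i, B) for i, B in enumerate(blocks, start=1) if not (B & bad)]

def select_committee(blocks, bad, passes_checks):
    """Try the candidates in order; return the first one whose members all pass."""
    for i, B in candidates(blocks, bad):
        if passes_checks(B):
            return i, B
    return None

# Illustrative published set system (the lines of the Fano plane).
BLOCKS = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]
bad = {1, 4}

print(candidates(BLOCKS, bad))
# Suppose server 2 turns bad after Detection, so any block containing it fails.
print(select_committee(BLOCKS, bad, lambda B: 2 not in B))
```

The second call shows why candidates are tried one after another: a block that was clean at detection time may still fail later because the adversary is mobile.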

The proactive secret sharing scheme with combinatorial structure works as
follows. The initialization is the same as that in Section 4. In each time period
the system does the following.

1. Run Detection to obtain the list $\mathcal{L}$ of bad servers and the executive
committee candidates $B_{i_1}, B_{i_2}, \ldots, B_{i_e}$.

2. If an executive committee has not been found, then for the next executive
committee candidate $B$, each $P_g \in B$ does the following:

(a) Selects a random symmetric polynomial

$$r^{(g)}(x, y) = \sum_{i=0}^{t-1} \sum_{j=0}^{t-1} r_{ij}\, x^i y^j,$$

where $r_{00} = 0$ and $r_{ij} = r_{ji}$ for all $i, j$, and sends $h_k^{(g)}(x) = r^{(g)}(x, \omega^k)$ to
$P_k$ for $k = 1, 2, \ldots, n$, $k \ne g$, by private channel and broadcasts
$h_0^{(g)}(x) = r^{(g)}(x, 0)$.

(b) $P_k$ checks whether $h_0^{(g)}(0) = 0$ and $h_k^{(g)}(0) = h_0^{(g)}(\omega^k)$ for $g \in B$. If
the conditions are satisfied, then $P_k$ computes and sends $P_m$ the value
$h_k^{(g)}(\omega^m)$. Otherwise $P_k$ broadcasts an accusation of $P_g$.

(c) $P_m$ checks whether $h_m^{(g)}(\omega^k) = h_k^{(g)}(\omega^m)$ for $g \in B$. If the equation is
not true, then $P_m$ broadcasts an accusation of $P_g$.

(d) A member of $B$ that is accused by at least $b + 1$ servers is bad. If a member
of $B$ is accused by at most $b$ servers, then it can defend itself. If no member
of $B$ is bad, then $B$ is found to be the executive committee.

3. The system runs the recovery protocol to recover the shares for the servers
in $\mathcal{L}$.

4. Each server $P_m$ updates its shares as

$$h_m(x) \leftarrow h_m(x) + h_m^{(g)}(x)$$

for all $g \in B$.

The reconstruction protocol is the same as that in Section 4.

5.3 Applying Combinatorial Structures to Other Schemes 

The proactive secret sharing scheme proposed by Herzberg et al. in [?] is similar
to our scheme in Section 4. Thus it is straightforward to apply our method with
combinatorial structures to their scheme. In general, suppose a proactive secret
sharing scheme has the following properties:

1. Information from any $t$ good servers can be used to renew shares and recover
shares.

2. A VSS exists in which any server can use this VSS to send data which can
be verified by the system.

3. There is a detection protocol to find the bad servers.

4. There is a defense protocol so that an accused server can be determined bad
or good by the system.

5. There are renewal and share recovery protocols.

6. There is a set system $(X, \mathcal{B})$ satisfying the conditions of Subsection 5.1.

Then we can use the following scheme for renewal and share recovery protocols.

1. Run the detection protocol to obtain a list $\mathcal{L}$ of bad servers and the executive
committee candidates $B_{i_1}, B_{i_2}, \ldots, B_{i_e}$.

2. If an executive committee has not been found, then for the next executive
committee candidate $B$, each $P_g \in B$ does:

(a) Send recovery information $rc_k^g$ to $P_k$ for each $k \in \mathcal{L}$ in the system and
send renewal information $rn_l^g$ to $P_l$ for each $l$ in the system by VSS.

(b) The system checks the correctness of $rc_k^g$ and $rn_l^g$. If some mistake is
found, then $P_g$ is accused.

(c) If a member of $B$ is accused, then it can defend itself and the system can
decide whether it is bad. If no member of $B$ is bad, then $B$ is defined to
be the executive committee.

3. Each $P_k$ with $k \in \mathcal{L}$ recovers its share using $\{rc_k^g : g \in B\}$.

4. Each server $P_l$ renews its share using $\{rn_l^g : g \in B\}$.

It is readily checked that the proactive secret sharing scheme of [?] satisfies
all the properties we needed. Thus we can use the combinatorial method to
improve their scheme. The details are omitted here.
Abstract. In the world of mobile agents, security aspects are being discussed
extensively, with strong emphasis on how agents can be protected against
malicious hosts and vice versa. This paper discusses a method for concealing
an agent’s route information so that sites en route cannot misuse it to collect
profile information about the agent’s owner. Furthermore, it is shown that the
protected route resists attacks from a single malicious host and from colluding
malicious hosts as well.



1 Introduction 

Mobile agents are becoming more and more important for Internet based elec- 
tronic markets. In many scenarios, mobile agents represent customers, salesmen 
or mediators for information, goods and services [1]. Agents are autonomous 
programs, which, following a route, migrate through a network of sites to ac- 
complish tasks or take orders on behalf of their owners. Without any protection 
scheme, a visited site may read an agent’s data and thus collect information 
about the agent’s owner, e.g. its services, customers, service strategies, collected 
data, etc. To avoid such a situation, the amount of data accessible to a visited 
site has to be restricted as much as possible [2]. 

In this paper, we concentrate on an agent’s route information, i.e. the address 
list of sites to be visited during a trip. The owner provides its agent with an initial 
route. On its travel, the agent works through the route stage by stage. Protecting 
the route guarantees that none of the visited stations can manipulate the route 
in a malicious way or can get an overview of other sites the agent’s owner is 
contacting. To repel malicious programs such as Trojan horses, the identity of an
agent becomes known to all visited sites [3]; anonymous agents [4] are not dealt
with in this paper. In the following we use the concept and terminology of the
ALOHA¹ software package [5].

¹ The ALOHA (Agent Local Help Application) environment allows its users to easily
define, send, receive and evaluate agents.



Howard Keys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 215-^^^ 2000. 
© Springer-Verlag Berlin Heidelberg 2000 






Dirk Westhoff et al. 



2 Related Work 

Methods that protect an agent against attacks can be categorized into those 
which prevent attacks and those which detect attacks. 

Cryptographic traces [6] detect illegal modifications of an agent by a post- 
mortem analysis of data the agent collected during its journey. State appraisal [7] 
mechanisms protect an agent’s dynamical components, especially its execution 
state, against modifications. When an agent reaches a new site, the appraisal 
function is evaluated passing as a parameter the agent’s current state. 

Tamper-proof devices [8] are hardware based and therefore not suitable in 
open systems. Software based approaches include the computation of encrypted 
functions [9] and code scrambling [10]. Unfortunately the first approach can only
be applied to polynomial and rational functions. In [10] the code of an agent is
re-arranged to disguise the agent’s functionality.

Although onion routing [11] is not an agent specific approach, it uses partial 
encryption similar to our method. Onion routing allows anonymous connections 
and is used to protect a variety of Internet services against eavesdropping and 
traffic analysis attacks. In contrast, beside concealing the route, our solution 
detects attacks that modify an agent’s route but allows legal route changes. 



3 Framework 

An agent is an autonomous program which acts on behalf of its owner. According 
to its route, it visits sites linked together via a communication network. An agent 
is created, sent, finally received and evaluated in its owner’s home context. At a 
visited site, the agent is executed in a working context. To save costs, an agent
usually does not return to its home context before it has worked off its route;
thus during the agent’s journey its home site does not have to be connected to the
communication network all the time. To forward an agent to its next site or to
extend an agent’s route, each visited site needs access to a certain part of the
route. A site is not allowed to remove a not yet visited site from the initial route
and thus, e.g., exclude sites from offering their services to the agent. When a
visited site extends a route, all added sites must become aware of the fact that
they have not been on the initial route, and thus may, e.g., restrict the agent’s
access rights and functions, e.g. for electronic cashing. When a site changes a
route, the change must be uniquely associated with the site. To avoid arousing
suspicion, as soon as a site detects an attack against an agent, it has to send the
agent back to its home context.

In the following, we present a concept which reduces the route information, 
that becomes visible to a visited site, to a minimum, and which protects the route 
against malicious changes. In addition the concept is flexible enough to handle 
route extensions during the agent’s journey. Our concept carefully considers 
efficiency aspects like computational complexity, additional network traffic, etc. 
Because of the latter, the concept of a trustworthy center to be visited between 




Protecting a Mobile Agent’s Route against Collusions 






each two consecutive sites is not taken up. Each site is only given access to the
address of its predecessor and its successor site.²

Additionally to the route, the agent includes other components like profile, 
binary code, mobile data and a trip marker per journey. Agents have to be pro- 
tected against passive, reading attacks, and active attacks that modify an agent’s 
functionality or even fully destroy it. This paper concentrates on the route and 
its protection against attacks performed by one single context as well as on 
attacks performed by collusions of cooperating malicious contexts. 



4 Protecting the Initial Route



When an agent migrates from working context $c_i$ to working context $c_{i+1}$, all its
objects are encrypted and thus protected against passive attacks. Before starting
an agent, (except in very special cases [9]) a working context has to decrypt parts
of the agent and thus make the agent vulnerable to active or passive attacks.
Thus, the agent’s route should be protected as much as possible.

An unprotected route $r = ip(c_1) \,\|\, \ldots \,\|\, ip(c_n)$ is a concatenated list of
Internet addresses $ip(c_i)$. To abort an agent’s journey, each site has to know
the Internet address $ip(h)$ of the home context $h$, which therefore is stored in
plaintext separate from the protected route.

The home context $h$ signs data relevant for each working context to be visited by
means of a signature $S$ and encrypts the agent’s route using an asymmetric
encryption method $E$ and the $i$-th working context’s public key $e_i$ for
$i = 1, \ldots, n$.³ Applying encryption in the manner known from onion routing
networks [11] for achieving untraceable communication, $h$ composes

$$
\begin{aligned}
r = E_{e_1}\big[\, & ip(h),\ ip(c_2),\ S_h(ip(h), ip(c_1), ip(c_2), t, [\ldots]), \\
    E_{e_2}\big[\, & ip(c_1),\ ip(c_3),\ S_h(ip(c_1), ip(c_2), ip(c_3), t, [\ldots]), \\
                   & \qquad \vdots \\
    E_{e_{n-1}}\big[\, & ip(c_{n-2}),\ ip(c_n),\ S_h(ip(c_{n-2}), ip(c_{n-1}), ip(c_n), t, [\ldots]), \\
    E_{e_n}\big[\, & ip(c_{n-1}),\ EoR,\ S_h(ip(c_{n-1}), ip(c_n), EoR, t) \,\big] \big] \cdots \big] \big]
\end{aligned}
\qquad (1)
$$



² In general the communication protocol itself automatically provides a receiver with
the address of the sender.

³ The ALOHA software package uses RSA as the asymmetric encryption $E$. The
signature $S$ is based on a combination of SHA and RSA. We suppose that problems
concerning the generation of asymmetric key pairs, certification and distribution of
public keys are solved.
Here, $t$ denotes a trip marker which is unique for each agent’s journey and
the EoR entry indicates the end of the route. Thus, the route contains the
encrypted addresses of all working contexts that shall be visited and the signatures
to prove the integrity of the route.

During the agent’s journey, a working context $c_i$ decrypts the Internet address
$ip(c_{i+1})$ of its successor and $ip(c_{i-1})$ of its predecessor, its relevant signature and
all the remaining ciphertext by using its private key $d_i$ that corresponds to the
public key $e_i$. Then, each site removes the decrypted address and signature from
the agent’s route. All other route entries are hidden from the current working
context.

With the help of digital signatures, active attacks can be detected. The signature
of the address, which is signed by the agent’s home context $h$ and presented
to the current working context $c_i$, proves that the route entry for $c_i$ has not
been modified. The influence of the Internet address $ip(c_i)$ and the trip marker $t$ on
the signature guarantees to the current working context that it is itself part of the
initial route. The predecessor’s address $ip(c_{i-1})$ is taken into account for the
signature’s computation in order to avoid a special collusion attack to be explained
later.

The uniqueness of the trip marker t is necessary to prevent replay attacks. 
Otherwise, a malicious working context could replace the complete route of the 
actual agent with a copied route of an agent’s earlier journey. 

With the help of the EoR entry, working context $c_n$ realizes that it is itself
the final entry of the agent’s route. Via EoR and $t$ in the signature, working
context $c_n$ is able to check whether these data have been generated by $h$ and
whether it was itself really included in the initial route.

Signatures must be encrypted by the home context as well, otherwise, if 
the agent carried the signatures in plaintext, under certain circumstances an 
attacker, who knows t, would be able to reconstruct the complete route by ar- 
ranging and testing combinations of possible addresses. Such an attack would 
be feasible if the number of relevant sites is small. 

In the following, we discuss methods with regard to active attacks which
modify the functionality of an agent’s route. Such attacks can be classified into
those where the attacker tries to cheat alone, without the help of any other
working context, and those where the attackers act in collusion with other
dishonest working contexts.

The onion-like signature of the route ensures that all the desired working 
contexts have to be visited in order that the further route information can be 
decrypted. The signatures which depend on all instantaneously existing route 
data allow the detection of attacks as early as possible. 

If context $c_i$ receives an agent, then $c_i$ is the only one that can reveal the
successor’s address; if the signature check is positive, $c_i$ can be sure that

— it was included in the initial route, 

— it received the agent from the correct predecessor, 

— the successor’s address is correct, 

— all further data contained in the route are not compromised. 
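The nesting of expression (1) and the checks a visited site performs can be sketched structurally. The code below is a toy model only: `Sealed` merely records who is allowed to open a layer and `sign` is a plain hash, so neither provides any cryptographic protection; they are hypothetical placeholders for the asymmetric encryption $E$ and the home context's signature $S_h$. Site names and the trip marker are made up for illustration.

```python
import hashlib
from dataclasses import dataclass

EOR = "EoR"  # end-of-route marker

@dataclass(frozen=True)
class Sealed:
    """Placeholder for E_{e_i}[...]: models 'only `recipient` can decrypt'."""
    recipient: str
    payload: tuple

def sign(*fields):
    """Placeholder for S_h(...): a plain hash, NOT an unforgeable signature."""
    return hashlib.sha256(repr(fields).encode()).hexdigest()

def build_route(home, sites, trip):
    """Nest the entries from the last site outwards, as in expression (1)."""
    layer = None
    for i in reversed(range(len(sites))):
        pred = home if i == 0 else sites[i - 1]
        succ = EOR if layer is None else sites[i + 1]
        sig = sign(pred, sites[i], succ, trip, layer)
        layer = Sealed(sites[i], (pred, succ, sig, layer))
    return layer

def peel(route, me, sender, trip):
    """What site `me` does on arrival: open one layer and run the checks."""
    if route.recipient != me:
        raise PermissionError("layer is not encrypted for this site")
    pred, succ, sig, rest = route.payload
    if sig != sign(pred, me, succ, trip, rest):
        raise ValueError("signature check failed")
    if pred != sender:
        raise ValueError("agent arrived from an unexpected predecessor")
    return succ, rest

route = build_route("h", ["c1", "c2", "c3"], "trip-1")
sender, layer = "h", route
for site in ["c1", "c2", "c3"]:
    succ, layer = peel(layer, site, sender, "trip-1")
    print(site, "->", succ)
    sender = site
```

In this model a site that tries to open a later layer, presents a route from a different trip, or receives the agent from the wrong predecessor triggers an exception, mirroring the four guarantees listed above.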
The home context knows the last entry $c_n$ of the agent’s route, i.e. the context
from which it finally expects the agent. This prevents any other working context
in the initial route from returning the agent too early.

If a malicious working context $c_i$ intends to cheat without the help of any
other working context, e.g. by deleting entries from the route, by replacing entries
in the route or by adding new entries to the route, the next honest working
context will detect such an attack immediately.

If $c_i$ deletes ciphertext whose corresponding plaintext refers to a working
context $c_j$ with $j > i + 1$, working context $c_{i+1}$ will detect immediately by
signature check that the route was compromised. If $c_i$ tries to skip $c_{i+1}$, it would
not be able to reveal the address $ip(c_{i+2})$. So the malicious context $c_i$ is only
able to forward the agent to a randomly chosen address.

If $c_i$ adds new addresses or if $c_i$ replaces entries either in plaintext or in
ciphertext, it would never be able to use the right signature key, and so the attack
would become obvious as early as possible.



Fig. 1. Attack on an atomic encrypted route under collusion. Panels: a) initial
route; b) broadcasting of the current route.



In all previous attacks, it was assumed that a malicious working context does
not exploit the help of another dishonest working context. In general, a home
context does not know whether collusions exist, nor which working contexts are
acting in collusion.

The strength of the secured route presented in expression (1) becomes
clear when considering collusions. If the route entries are encrypted in an atomic
way instead of an onion-like structure and if the initial route contains at least two
members $c_i, c_j$ of a colluding group which are not adjacent ($j > i + 1$), the first
visited dishonest context $c_i$ could act in the following way: having obtained the
agent, $c_i$ broadcasts copies of the current route to its accomplices. Afterwards,
the accomplices try to find out by sequential decryption whether they are members of
the initial route. If accomplice $c_j$ obtains a reasonable result after decryption of
the presented ciphertexts, $c_j$ has found itself to be a valid member of the initial route.
Now, $c_j$ informs $c_i$ that it is also a member of the agent’s route, and $c_i$ forwards
the agent to $c_j$, skipping all those route entries in the initial route lying in
between $c_i$ and $c_j$ [see fig. 1]. No other honest working context $c_k$, $j < k \le n$,
would ever be able to detect this attack.

Skipping honest contexts by exploiting the power of collusions can be avoided
if the route is secured as in expression (1). In an onion-like protection scheme,
a malicious working context $c_i$ is only able to reveal more than one succeeding
route entry $c_{i+1}, \ldots, c_m$ if all these adjacent contexts belong to the colluding
group. In this case, a copy of the route can be forwarded context by context
as long as the succeeding context is a member of the colluding group.⁴ Then, the
last member of the colluding group $c_m$ found in that way is able to order the
agent from $c_i$ and can forward it to the honest context $c_{m+1}$, which is not able
to detect that contexts $c_{i+1}, \ldots, c_{m-1}$ have been skipped.

Even if such an attack is possible, accomplices $c_i, \ldots, c_{m-1}$ are not motivated
to act in the described way. In contrast to the atomic encrypted approach, no
honest context can become the victim of such an attack.



Fig. 2. Attack on onion-like protected route under collusion. Panels: a) initial
route; b) route after attack.



⁴ The probability for a malicious context to find one of its accomplices as successor
in the initial route depends on the number of members in the colluding group, the
total number of contexts in the whole agent system and, when using the atomic
encrypting approach, the current length of the route. In most cases, with an
onion-like protected route, this probability is significantly smaller than in the case
of atomic encryption.
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Instead of skipping dishonest contexts, malicious contexts may try to add 
new dishonest contexts to the initial route. If there are two adjacent malicious 
contexts Ci and c^+i in the initial route, Ci can forward the agent via new dis- 
honest contexts cj, . . .,c^ to Cj+i [see fig. 2]. This attack can not be detected 
by the following honest context Cj+ 2 . But again, no honest context can become 
the victim of such an attack. 

The influence of the predecessor’s address becomes obvious if one considers a 
similar attack. Because the signature depends on the predecessor’s 
address, a malicious context c_i is not able to forward the agent via accomplices 
c'_1, ..., c'_k to an honest context c_{i+1} without being detected [see fig. 3]. 



Fig. 3. Detectable attack on onion-like protected route under collusion: a) initial route, b) route after attack. 



To sum it up and to illustrate the strength of the protection scheme, neither 

— skipping of honest working contexts, nor 

— replacing honest working contexts by new contexts, nor 

— adding honest contexts 

is feasible without being detected afterwards, either by an honest working context 
or by the home context. 

Of course, visited working contexts are able to exchange information about 
agents (predecessors, successors) a posteriori to perform route analysis and to 
obtain information about the visited contexts. For example, a malicious working 
context c_i can broadcast to all its accomplices that it was visited by a particular 
agent with predecessor c_{i-1} and successor c_{i+1}. If the agent later visits a 
collaborating context c_j, this context can report its predecessor c_{j-1} and successor 
c_{j+1} to context c_i. The larger the number of members in the colluding group, 
the higher the probability that at least one accomplice is a member of 
the initial route. But as this probability grows, the attack becomes more and 
more costly, because of the increasing number of candidates to which the agent 
information has to be forwarded. Unless the agent’s identity is hidden, such an 
attack can never be prevented by means of cryptography. 

To summarize the properties of the presented scheme: onion-like encryption 
provides concealment of route entries to the maximum degree. Without collusions, 
every attack can be detected as early as possible by the next visited working 
context. But even under collusions, no honest context can be skipped and no 
maliciously added or replaced honest working contexts can be visited without 
being noticed. If the home context receives its agent without an error message, 
it can be sure that its agent visited all honest contexts in the initial route, and 
as many addresses as possible are kept secret. Furthermore, every visited honest 
working context can verify if it is a member of the initial route. 

In the following we will examine cases in which routes are legally extended 
during an agent’s journey. 



5 Extending the Initial Route 

If, in order to provide its service, a context c_i on the agent’s initial route needs the 
cooperation of other contexts c^x_1, ..., c^x_m, then working context c_i should be 
allowed to extend the initial route in a legal way. Of course c_i should not be 
allowed to delete unvisited entries from the initial route. 

To protect the home context’s interests, the new entries do not include any 
confidential information.² On the other hand, c_i may be interested in protecting 
its new entries. Let c^x_1, ..., c^x_m be these new entries. Like a home context, c_i 
encrypts the new route extension and includes the route extension as a prefix to 
the current route r: 



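\[
\begin{aligned}
r^x = {} & E_{c^x_1}\bigl[\, ip(c_i),\ ip(c^x_2),\ S_i\bigl(ip(c_i), ip(c^x_1), ip(c^x_2), t, E_{c^x_2}[\cdots], r\bigr), \\
& E_{c^x_2}\bigl[\, ip(c^x_1),\ ip(c^x_3),\ S_i\bigl(ip(c^x_1), ip(c^x_2), ip(c^x_3), t, E_{c^x_3}[\cdots], r\bigr), \\
& \quad \cdots \\
& E_{c^x_{m-1}}\bigl[\, ip(c^x_{m-2}),\ ip(c^x_m),\ S_i\bigl(ip(c^x_{m-2}), ip(c^x_{m-1}), ip(c^x_m), t, E_{c^x_m}[\cdots], r\bigr), \\
& E_{c^x_m}\bigl[\, ip(c^x_{m-1}),\ EoX,\ S_i\bigl(ip(c^x_{m-1}), ip(c^x_m), EoX, t, r\bigr) \bigr] \cdots \bigr] \;\|\; r
\end{aligned}
\qquad (2)
\]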



In contrast to the protection scheme of the initial route, the signature in the 
extension of the route depends on the extension and on the current initial route 
r. The new parameter EoX indicates the end of the extension. If c_i extends a route, 
it must not delete its successor and the corresponding signature from the current 
initial route. 
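
The nesting of expression (2) can be illustrated with a small sketch. It is not the authors' implementation: the public-key encryption E and the signature S_i are replaced by stand-ins (a tagged dictionary and an HMAC under a key assumed to be known to the verifier), and the names sign, seal, build_extension and check_entry are hypothetical. The sketch only shows how each entry binds the predecessor, the entry's own address, the successor, the timestamp t, the remaining entries and the current route r.

```python
import hashlib
import hmac
import json

EOX = "EoX"  # end-of-extension marker


def sign(key: bytes, *fields) -> str:
    """Stand-in for S_i: an HMAC over the serialized fields (not a real signature)."""
    msg = json.dumps(fields, sort_keys=True).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def seal(recipient: str, payload: dict) -> dict:
    """Stand-in for E_c[...]: only marks the payload as addressed to `recipient`."""
    return {"for": recipient, "payload": payload}


def build_extension(key, extender, new_contexts, t, r):
    """Build the nested extension entries, innermost (last) entry first."""
    hops = [extender] + list(new_contexts)
    inner = None
    for idx in range(len(hops) - 1, 0, -1):
        pred, me = hops[idx - 1], hops[idx]
        succ = hops[idx + 1] if idx + 1 < len(hops) else EOX
        # The signature covers the remaining entries and the current route r.
        sig = sign(key, pred, me, succ, t, inner, r)
        inner = seal(me, {"pred": pred, "succ": succ, "sig": sig, "rest": inner})
    return inner


def check_entry(key, me, sender, entry, t, r) -> bool:
    """Check that the entry is addressed to `me`, names `sender` as predecessor,
    and carries a matching signature."""
    p = entry["payload"]
    expected = sign(key, p["pred"], me, p["succ"], t, p["rest"], r)
    return (entry["for"] == me
            and p["pred"] == sender
            and hmac.compare_digest(expected, p["sig"]))
```

Because the predecessor's address is part of the signed fields, an entry delivered by any context other than the expected predecessor fails the check in this sketch.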

Having visited all sites c^x_1, ..., c^x_m of a route extension, the agent returns 
to c_i under the presented concept [see fig. 4]. 

² The new contexts are merely informed that c_i is on the initial route. 

Fig. 4. Extending a route. 



When the agent returns to c_i, the context obtains its original part of the initial 
route and can now decrypt its successor’s address and check the signature. If it 
does not receive the agent from the expected context c^x_m, then it knows that 
something illegal happened during the agent’s journey along the extended route. 

The route extension inherits the protection properties of the scheme described 
in section 4. All signatures in the route extension stem from c_i, and the route 
extension ends with an EoX entry. If a context receives such an EoX entry, it 
sends the agent back to c_i. To detect attacks as early as possible, the remaining 
ciphertexts of the initial route are included in c_i’s signature. Thus, when the 
agent returns from c^x_m to c_i, working context c_i can be sure that all honest sites 
of the route extension have been visited. 

The concept of route extension may be extended: each context of a route 
extension may be allowed to extend the route extension itself as well. In all 
these cases the presented scheme guarantees that either all honest sites in the 
initial route as well as all allowed route extensions are visited, or the journey is 
explicitly aborted because of an attack or an inaccessible site. 

6 Unreachable Working Contexts 

In many systems, a gain in technical data protection comes with a loss of 
functionality and flexibility. If a working context c_{i+1} is unavailable, because 
it terminated regularly or irregularly or the network connection was interrupted, 
then context c_i, which can decrypt only the Internet address ip(c_{i+1}), cannot 
simply skip over context c_{i+1}. In such a case, context c_i has to choose one of the 
following strategies: 

— As long as c_{i+1} is unreachable, the agent waits in c_i. 

— The agent aborts its journey and migrates back to its home context. 

— Having tried to reach c_{i+1} for a certain time, the agent migrates back to its 
home context. 
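
The three strategies differ only in how long the agent keeps trying, so they can be sketched as a single retry loop. This is a minimal illustration under assumptions: the function forward_agent, its parameters and the Outcome values are hypothetical, and try_migrate stands for whatever migration call the agent system provides.

```python
import time
from enum import Enum


class Outcome(Enum):
    MIGRATED = "migrated"
    RETURNED_HOME = "returned_home"


def forward_agent(try_migrate, max_wait_s, retry_interval_s=1.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Try to migrate to the successor context.

    max_wait_s=None : wait as long as the successor is unreachable.
    max_wait_s=0    : abort after the first failed attempt.
    max_wait_s>0    : keep trying for that long, then return home.
    """
    start = clock()
    while True:
        if try_migrate():
            return Outcome.MIGRATED
        if max_wait_s is not None and clock() - start >= max_wait_s:
            # The caller should record the reason so that it becomes
            # obvious to the home context.
            return Outcome.RETURNED_HOME
        sleep(retry_interval_s)
```

For example, `forward_agent(lambda: False, 0)` returns `Outcome.RETURNED_HOME` after a single failed attempt, which corresponds to the second strategy.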

When an agent aborts its journey, the reason should become obvious to its 
home context. When the probability to reach a context falls below a certain level, 
the route protection mechanism should be changed to become more flexible, 
though from the data protection point of view this may be disadvantageous. 



7 Conclusions 

In this paper we presented a method for the concealment of an agent’s route in- 
formation. Furthermore, we showed that our method resists all presented attacks 
performed by a single malicious context. But even under attacks of malicious con- 
texts acting in collusion no honest working context can become a victim. Thus, 
when receiving its agent without an error message the home context can be sure 
that all honest contexts have been visited. All working contexts, and naturally 
also the honest ones, can detect if the home context intended its agent to visit 
them. 

Additionally, in cases where a working context needs the cooperation of other 
contexts for providing a service, the method allows legal route extensions. If a 
context extends an agent’s initial route, having visited all sites of the route extension 
the agent returns to the context that has extended the route, and proceeds on 
its initial route. Following this pattern, one can develop several levels of route 
extension and use the described method for protecting the agent’s route against 
attacks. 



8 Future Work 

To precisely identify the contexts that ran an attack, more in-depth research 
is needed. We are also extending our route protection mechanisms to other agent 
components. Mobile data can be handled similarly to routes: at least parts of 
them must become accessible only to specific contexts. In contrast, the agent’s 
binary code has to be fully decrypted at each site. By using checksums and a 
kind of third party protocol [13], [14], attacks can be detected and afterwards 
malicious contexts can be forced to forward correct binary code. Nevertheless, 
such a protocol cannot verify whether a working context really started an agent. 
Detection objects [15] may ensure this. 



Abstract. Historic look at design principles and requirements for a 
practical key management communication protocol. Refinement of terms, 
threat environment and limitations, and necessary features. Dynamic 
computational time and network round-trip time are well integrated with 
protocol specification. First use of anti-clogging tokens to defend against 
resource attacks. 



1 Motivation 

Photuris¹ is a session-key management protocol featuring authenticated key 
agreement with confirmation, defense against resource clogging, forward secrecy 
of the session-keys, and privacy protection for the parties. 

Photuris was based on currently available tools, by experienced network 
protocol designers with an interest in cryptography, rather than by cryptographers 
with an interest in network protocols. Designing and implementing 
protocols for an open public network requires consideration of a distributed 
threat environment that includes bandwidth limitations and propagation la- 
tency, ubiquitous casual packet snooping and malicious interference, as well as 
faulty hardware and software implementation errors. 

The definitions of protocol features and threats are more stringent than found 
in recent survey publications. Distinctions are made between similar 
failure modes that have different causes, to assist in analysis and in specifying 
corrective measures. Other distinguishing terminology is used, with the usual 
denotative interpretation, conveying concepts that have more colorful terms (such 
as “spam” and “smurf”) in the network operations community. 



¹ “Photuris” is the Latin name for the firefly. “Firefly” is in turn the name for the 
USA National Security Agency’s (classified) key exchange protocol for the 
STU-III secure telephone. 



Howard Heys and Carlisle Adams (Eds.): SAC’99, LNCS 1758, pp. 226 ff., 2000. 
© Springer-Verlag Berlin Heidelberg 2000 



2 Fundamental Principles 

2.1 End-to-End 

The ultimate objective of Internet Security is to facilitate direct Internet Protocol 
(IP) end-to-end connectivity between sensitive hosts and users over the Internet. 
Users will rely on Internet Security to protect the confidentiality of the traffic 
they send across the Internet and depend on it to block unauthorized external 
access to their internal hosts and networks. 

Users must have confidence in every Internet Security component, including 
key management. Without this confidence, users may erect barriers (“firewalls”) 
that impede legitimate use of the Internet, or forego the Internet entirely. 



2.2 Keys 

Internet Security must not place any significance on the easily forged IP Source 
field. It relies instead on proof of possession of secret knowledge: that is, a cryp- 
tographic key. 

However, secure manual distribution and maintenance of these keys is often 
cumbersome and problematic. User distribution often leads to long-lived keys, 
with concomitant opportunity for compromise of the keys. 



2.3 Decentralized 

Widespread deployment and use of Internet Security is possible through the use 
of a key management protocol. For example, Kerberos can generate host-pair 
keys for use in Internet Security, much as it now generates session-keys for 
use by encrypted telnet and other “kerberized” applications. 

The Kerberos model has some widely recognized drawbacks. Foremost is the 
requirement for a highly available on-line Key Distribution Center (KDC), with a 
database containing every principal’s secret-key. This entails significant security 
risk. 

Public-key cryptography enables decentralization. Communicating 
entities can generate session-keys without real-time communication with any 
third party. 

3 Threat Environment 

Photuris establishes short-lived session-keys for Internet nodes that frequently 
access or are accessed by a large and unpredictable number of other nodes. 
In addition to stationary wired land-line installations, Photuris is intended to 
support mobile, wireless and satellite environments. Common activities include 
creating virtual private networks over the common public Internet, transient 
connections for mobile users and networks operating over bandwidth-limited 
links, and commercial transactions between numerous clients and servers. 



3.1 Delivery of Messages 

The Internet Protocol implements best-effort datagram delivery. Datagrams 
can be damaged, discarded, or duplicated. Successive datagrams may take 
different paths through the internetwork, resulting in differences in round trip 
timing, and re-ordering of the datagrams. 



3.2 Eavesdropping 



The Internet model of operation assumes that nodes listen to each other 
on their local network, and that intermediate nodes carry traffic between such 
networks. Every message will be passively monitored by many other parties. 



3.3 Interdiction 

An interceptor might selectively prevent the transmission of a correct message 
from one party to another. 



3.4 Modification 

An interceptor might selectively prevent the transmission of a correct message 
from one party to another, and modify the message, before sending it to the 
latter party. This is sometimes called a “Monkey In The Middle” (MITM). 



3.5 Races 

An interloper can observe the passage of a message from one party to another, 
and quickly send a bogus message (to either party), before the next correct 
message arrives (from the other party). When the transmission path of successive 
datagrams can vary, this race condition is unpredictable. 



3.6 Reflection 

An interloper can observe the passage of a message from one party to another, 
extract important fields, and send another message with those fields to that 
originating party. 



3.7 Replay 

An interloper can observe the passage of a message from one party to another, 
extract important fields, and send another message with those fields to the latter 
party. 



3.8 Resource Clogging 

The easiest and most common attack experienced in the Internet is an exces- 
sive number of datagrams sent to a target network or node. Processing these 
datagrams can exhaust the computing resources of a target node. This form of 
attack often has a randomized, forged IP Source. 



3.9 Resource Flooding 

A large number of datagrams might exceed the available bandwidth of a link, 
preventing the passage of legitimate datagrams. This form of attack usually has 
a sub-network broadcast IP Source. These are difficult to distinguish without 
knowledge of the specific topology of the (distant) sub-network. 

4 Threat Limitations 

Internet Security is not a panacea. It is not intended to prevent or recover from all 
possible security threats. Rather, it is designed to protect against most probable 
and feasible attacks. 

In particular, non-cryptographic attacks are outside the scope of this docu- 
ment. For example, cutting a link, jamming a radio signal, or tampering with 
a computer device might be important security threats, but are not within the 
province of a key management protocol. 

By the very nature of a key management protocol, the threat of Interdiction 
can be reduced to a non-cryptographic attack. Prevention of key manage- 
ment traffic is no more harmful than prevention of normal data traffic. In a secure 
environment, no normal traffic will flow without successful key agreement. 

The Resource Flooding (3.9) denial of service attack (exceeding the bandwidth 
of a link) is another non-cryptographic attack. These infrastructure attacks 
are best dealt with through other means. However, the 
use of Internet Security firewalls around vulnerable links (between links of 
significantly different bandwidth) can change the effect of the attack to Resource 
Clogging (3.8), where Internet Security provides a tractable solution. 



4.1 Anonymity 

Internet Security is not expected to function in an environment where the identi- 
ties of the principals are concealed from each other. Authenticated key agreement 
is antithetical to promiscuously accepting “anonymous” null and/or unverifiable 
identities. 

The effectiveness of any provision of anonymity is unknown. Some folks have 
asserted that traffic analysis is sufficiently thorough to determine the parties 
to any transaction. Unfortunately, thus far these analysts have refused to give 
concrete details. 



4.2 Multicast 

Key management is more difficult in a multicast environment. The IP Desti- 
nation indicates a potentially large and disparate group, rather than a single 
node. 

Senders to a multicast group may share a common Security Parameters Index 
(SPI), when all communications are using the same security configuration 
parameters. In this case, the receiver only knows that the message came from a 
node knowing the SPI for the group, and cannot authenticate which member of 
the group sent the datagram. 

Multicast groups may also use a separate SPI value for each IP Source. As 
each sender is keyed separately, data origin authentication is also provided. 

A participating node is not necessarily in control of the SPI selection process. 
A single node or cooperating subset of the multicast group may work on behalf 
of the entire group to set up a Security Association. 

It is anticipated that Photuris would be used first to establish a distribution 
SPI and session-key, and that another orthogonal key distribution mechanism 
will use that SPI to send the group keys. This is a matter for future research. 
Such mechanisms are outside the scope of this document. 



4.3 Multi-user Hostility 

Internet Security protects against threats that come from the external network, 
not from mutually suspicious users of the nodes themselves. In essence, this is 
another non-cryptographic mode of attack that warrants further elaboration: 

— A secure multi-user operating system is able to protect its resources from 
hostile users, and prevent one hostile user from damaging the resources con- 
trolled by another user. 

— A secure multi-user operating system incorporates strong support for user- 
oriented discretionary access controls. 

— If the operating system has any security vulnerability, such that internal 
information may be revealed or the information of one user may be in- 
advertently disclosed to another user, then there is no basis for separate 
user-oriented key management. 

It has been suggested that the Photuris exchange could also be established 
between particular application or transport processes associated with a user of 
a node. This is a matter for future research. Such mechanisms are outside the 
scope of this document. 

Successful use of application, transport or user-oriented keying requires a 
significant level of operating system support. Use of multi-user segregated ex- 
changes likely requires added functionality in the transport API of the implemen- 
tation operating system. Such mechanisms are emphatically outside the scope 
of this document. 




Photuris: Design Criteria 






5 Design Requirements 

The fundamental role of this key management protocol is to verify the values 
exchanged, while ensuring that the resulting keys are not known by another 
party. 



5.1 Strength 

It is required that it be computationally infeasible for any unintended party 
to discover the mutually computed shared-secret during the lifetime of the key 
management exchange. 

While it might seem obvious that a computer security protocol must be 
computationally secure, this requirement defines the extent of its cryptographic 
strength. That is, the minimal requirement is related to the lifetime of the 
ephemeral exchange itself, although other goals might extend the desired life- 
time. There is no requirement that the strength be effectively infinite. Typical 
exchange lifetimes are measured in minutes or hours. 

When coupled with the features described later, this minimal requirement im- 
poses conservative design limitations on the exchange messages and key deriva- 
tion techniques. While it is preferable that the strongest available method be 
used, a light-weight requirement allows the protocol to be securely deployed in 
cellular telephones, and other low computational power devices. 

In practice, the estimated strength of the computed shared-secret is chosen 
to match the cryptographic strength of the session-keys for the chosen security 
parameters. These relative strengths are measured in time rather than entropy. 
The protocol must transparently support increasing amounts of entropy corre- 
sponding to adversary improvements in computational power. 



5.2 Confirmation 

Explicit confirmation is required for completion of each phase of the protocol. 

While it might seem obvious (to an experienced network protocol designer) 
that the protocol run is not complete until the parties agree it has completed, 
there abound numerous examples of theoretical key agreement protocols without 
this important property. Typical network protocols execute a three-way hand- 
shake for both initiation and termination of a communication session. 

This requirement ensures that the protocol is robust against duplication, loss 
and re-ordering of messages, and effectively prevents many reflection and replay 
attacks. 

5.3 Authentication and Authorization 

Each party must successfully verify the exchanged protocol values before using 
any resulting keys. 

It has been shown that secure key agreement must be coupled to 
authentication. Each party needs assurance that an exchanged key is not shared 
with an imposter. 

In addition to ensuring protocol correctness, this requirement allows the re- 
sulting keys to be associated with access permissions and authorization policies. 
When using asymmetric (public/private key-pair) identities, it is possible that 
an active interception and modification attack will use entirely valid certificates. 
Operators should be suspicious when the peer identities are all certified by a 
single entity, such as the regional security agency equivalent. This attack can 
only be prevented through rigorous authorization policy enforcement. 



6 Design Features 



Photuris establishes short-lived session-keys between two parties, without passing 
the session-keys across the Internet. These session-keys directly replace the 
long-lived secret-keys (such as passwords and passphrases) historically configured 
for security purposes. 



The basic Photuris protocol utilizes these existing previously configured 
secret-keys for identification of the parties. This is intended to speed 
deployment and reduce administrative configuration changes. 

Photuris is independent of any particular party identification method or certificate 
format. Support for symmetric key party identification is required to be 
implemented, and asymmetric key party identification is optionally supported 
by extensions. 

In addition to establishing session-keys, Photuris is easily capable of generating 
high quality unpredictable secrets. This facility can be useful to augment 
or expand lower quality user symmetric secret-keys, and to substitute for 
computationally expensive asymmetric public/private-key operations. 

Photuris has been designed: 



— for frequent exchange of limited lifetime session-keys between parties. 

— for associating security parameters with these session-keys. 

— to thwart certain types of denial of service attacks on node resources. 

— to maximize computational efficiency. 

— to scale to a large number of networks and nodes. 

— to support the use of a variety of authentication methods, and facilitate the 
exchange of many identification types. 

— to protect the privacy of the parties and the associated security parameters. 

— to provide these services with minimal administrative configuration and user 
effort. 



6.1 Forward Secrecy 

Many security breaches in cryptographic systems have been facilitated by designs 
that generate traffic session-keys (or their equivalents) well before they are 
needed, and then keep them around longer than necessary. This creates many 
opportunities for compromise, especially by insiders. A carefully designed key 
management system can avoid this problem. 

The rule is to avoid using any long-lived keys (such as a public/private key- 
pair) to encrypt session-keys or actual traffic. Such keys should be used solely 
for identification (entity authentication) purposes. Theft of the key used to au- 
thenticate key management exchanges might allow the thief to impersonate the 
party in future exchanges, but itself would not decode any past traffic that might 
have been recorded. 

Session-keys for traffic authentication and encryption should be generated 
immediately before use, and then destroyed immediately after use, so that they 
cannot be recovered. Key generation values should not be directly derived from 
the values of any previous session-keys. 

Photuris utilizes cryptographic hashing algorithms for its key generation 
pseudo-random functions. The initializing data values are carefully arranged 
to avoid related key analysis. 

Session-keys are derived from large, unpredictable data values. At least two 
of these values are secret: 

1. the computed shared-secret. This is based on short-term secret values. In 
theory, it is possible that the shared-secret could be recovered (computationally) 
from the publicly exchanged values. 

2. authentication key(s) associated with the parties. These involve medium to 
long-term secret values. In practice, it is more likely that the authentication 
key(s) would be recovered (by theft or coercion) from the parties. 

This combination of multiple disparate secret values ensures that computa- 
tional discovery of session-keys through cryptanalysis of the key management 
system requires the solution of multiple “hard” problems. 
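
One way to picture this combination is a hash computed over both secrets together with public exchange values. The following is a minimal sketch under stated assumptions, not the Photuris key derivation: SHA-256, the length-prefixed concatenation, the counter-based expansion and all names (derive_session_key, label, cookies) are chosen here for illustration only.

```python
import hashlib


def derive_session_key(shared_secret: bytes, auth_key: bytes,
                       cookies: bytes, label: bytes, length: int = 32) -> bytes:
    """Derive `length` bytes of keying material from two independent secrets.

    Each input is length-prefixed so that different splits of the same
    bytes cannot produce the same hash input.
    """
    out = b""
    counter = 0
    while len(out) < length:
        h = hashlib.sha256()
        for part in (auth_key, shared_secret, cookies, label,
                     counter.to_bytes(4, "big")):
            h.update(len(part).to_bytes(4, "big"))
            h.update(part)
        out += h.digest()
        counter += 1
    return out[:length]
```

An adversary who learns only one of `shared_secret` or `auth_key` cannot recompute the output in this sketch; distinct `label` values yield unrelated keys for different purposes.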

6.2 Perfect Forward Secrecy 

Photuris goes to considerable lengths to achieve perfect forward secrecy. 
When the authentication key(s) are periodically destroyed, and the destruction 
is sooner than the feasible recovery of the shared-secret, the derived session-keys 
are not recoverable from the exchange. 

This goal raises the desirable strength for the computed shared-secret, to the 
expected lifetime of the authentication key(s). 

6.3 Privacy Masking 

Concealing the correspondents from other parties is often desirable for confi- 
dential traffic, especially where this would reveal the location of a mobile user. 
Although each IP datagram carries a cleartext IP Destination, the ultimate des- 
tination can be hidden by “laundering” it through an encrypted tunnel. The IP 
Source could be hidden in the same manner. If the tunnel IP Source has been 
dynamically allocated, it provides no useful information to an eavesdropper. 

Hiding Identities. This leaves the identifying information that the parties send 
for authentication. The identities can be easily protected using a privacy-key 
based on the established shared-secret. Message padding conceals variations in 
identity lengths. This prevents an eavesdropper from learning the identities of 
the parties, either directly from names in certificates or by checking against a 
known database of public keys. 
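
The effect of message padding on identity lengths can be shown with a short sketch. The padding layout below (a two-byte length prefix followed by zero bytes up to a fixed bucket size) is an assumption made for illustration and is not taken from the Photuris specification; the function names are hypothetical. The padded value would then be protected with the privacy-key before transmission.

```python
def pad_to_bucket(identity: bytes, bucket: int = 64) -> bytes:
    """Length-prefix the identity and zero-pad it to a multiple of `bucket` bytes."""
    if len(identity) > 0xFFFF:
        raise ValueError("identity too long for a two-byte length prefix")
    body = len(identity).to_bytes(2, "big") + identity
    return body + b"\x00" * (-len(body) % bucket)


def unpad(padded: bytes) -> bytes:
    """Recover the original identity from its padded form."""
    n = int.from_bytes(padded[:2], "big")
    return padded[2:2 + n]
```

With this layout, short identities of different lengths all occupy one bucket, so the ciphertext length reveals only the number of buckets used.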

Nota Bene: The terminology is a play on words. Masking in compu- 
tational algorithms is often applied to the hiding or extraction of field 
values. Masking in social venues is a physical device to hide identity and 
protect privacy. 

This privacy masking is distinguished from party anonymity, where 
one of the parties refuses to identify itself to the other. Mutual verification of 
authentication and authorization is fundamental to the security of this protocol. 

Caveats: The scheme is not foolproof. By posing as the Responder, an 
adversary could trick the Initiator into revealing its identity. 

The attack requires the adversary to (1) gain access to a physical trans- 
mission link and race the Responder, or (2) subvert Internet routing for 
the same purpose. These attacks are considerably more difficult than 
passive vacuum-cleaner monitoring. Moreover, unless the adversary can 
steal the authentication key belonging to the Responder, the Initiator 
will discover the deception when verifying the exchanged values. 

It is not possible for an Initiator to similarly trick the Responder. The 
Responder will verify the Initiator Identification before returning its own 
identity. 

Inhibiting Cryptanalysis. In addition to more obvious benefits, hiding the mes- 
sage fields inhibits cryptanalysis of session-key generation by reducing the num- 
ber of known fields. 

Also, privacy masking conceals the attributes associated with the visible 
traffic Security Parameters Index (SPI). Message padding conceals variations in 
attribute lengths. When multiple transform algorithms are implemented, hiding 
attribute choices may inhibit traffic cryptanalysis. 

Preventing Forgery. In real time transaction environments, such as banking, 
it can be even more critical that protection be provided against forgery. The 
confidentiality of the transaction might only be needed for a short period of 
time, yet protection against forgery will be needed for a relatively long period 
of time. Hiding the message verification fields prevents an adversary from direct 
verification of forgery attacks on the authentication function. 

However, unlike the Station-To-Station authentication protocol, 
the security of the message exchange is not dependent on hiding of the verification 
fields. Instead of unilateral signatures over public values, Photuris uses 
keying material contributed by both parties. In effect, the derived verification-keys 
are session-keys for the exchange, and share the property of multiple “hard” 
problems. 



6.4 Resource Defenses 

Protecting sensitive data against compromise while in transit over the Internet 
is necessary, but not sufficient. The network and computing resources themselves 
must also be protected against unauthorized access, malicious attack or sabotage. 

To grant access to authorized users regardless of location, it must be possible 
to cheaply detect and discard bogus datagrams. Otherwise, an adversary intent 
on sabotage might rapidly send datagrams to exhaust the node’s CPU or memory 
resources. 

Using Internet Security authentication facilities, when a datagram does not 
pass an authentication check, it can be discarded without further processing. 
This is easily done with manual (null) session-key management between trusted 
nodes at relatively little cost, given the speed of cryptographic hashing functions 
compared to public-key algorithms. 

Unfortunately, such a trusted node will have only a fixed number of keys 
available. These keys will tend to have long lifetimes. This entails significant 
security risk. 

Automatic key management is necessary to generate short-lived session-keys 
between parties. But, there is a potential Achilles heel in the key management 
protocol. 

Because of the use of CPU-intensive operations such as modular exponentiation, 
key management schemes based on public-key cryptography are vulnerable 
to Resource Clogging (3.8). Although a complete defense against such attacks 
is impossible, Photuris features make them much more difficult. Resistance is 
accomplished with multiple, successive, inter-dependent layers. 

Anti-clogging Tokens. Path validation is achieved through the exchange of unique
“cookies” in the first phase. These tokens are included as an exchange identifier
($M_{IR}$) in every subsequent message.

This Cookie Exchange provides a weak form of message origin authenti- 
cation and verifies the presence of network communications between the parties, 
thwarting the saboteur from using random IP Source addresses. The simple vali- 
dation of these cookies uses the same level of resources as other Internet Security 
authentication mechanisms. 

This forces the adversary to (1) use its own valid IP address, or (2) gain 
access to a physical transmission link and appropriate a range of IP addresses, 
or (3) subvert Internet routing for the same purpose. The first option allows 
the target to detect and filter out such attacks, and significantly increases the 
likelihood of identifying the adversary. The latter two attacks are considerably 
more difficult than merely sending large numbers of datagrams with randomly 
chosen IP Source addresses from an arbitrary point on the Internet.

Caveats: The cookie exchange does not protect against an interloper
that can race to substitute another cookie, or an interceptor that
can modify and replace a cookie. As noted earlier, these attacks
are considerably more difficult than passive vacuum-cleaner monitoring.
Moreover, unless the adversary can steal the authentication key belonging
to the Responder, the Initiator will discover the deception when
verifying the exchanged values.

Exchange Identifier. The message exchange identifier ($M_{IR}$) consists of an
ordered pair (both cookies). The Responder cookie ($c_R$) is dependent on a
time-variant secret ($K_{cR}$), the Initiator cookie ($c_I$), an anti-replay
exchange counter ($C'$), and other implementation specific factors. It should
not be possible to successfully Reflect or Replay an earlier cookie from either
party.
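
As a rough illustration of how such a cookie can be validated without storing any per-exchange state, the following sketch recomputes the Responder cookie from the fields carried in the message. The hash (SHA-256), the 16-byte truncation, the field encodings, and the function names are assumptions made for this example; they are not taken from the Photuris specification.

```python
import hashlib
import hmac
import os


def responder_cookie(secret: bytes, initiator_cookie: bytes, counter: int,
                     schemes: bytes, source_ip: str) -> bytes:
    """Derive a Responder cookie from the secret and the request fields."""
    h = hashlib.sha256()  # assumed hash; the text only requires some H
    for field in (secret, initiator_cookie, bytes([counter]), schemes,
                  source_ip.encode()):
        # Length-prefix each field so distinct inputs cannot collide.
        h.update(len(field).to_bytes(2, "big") + field)
    return h.digest()[:16]  # assumed cookie length


def cookie_is_valid(live_secrets: list, initiator_cookie: bytes, counter: int,
                    schemes: bytes, source_ip: str, presented: bytes) -> bool:
    """Recompute the cookie under each live secret; nothing is looked up."""
    return any(
        hmac.compare_digest(
            responder_cookie(s, initiator_cookie, counter, schemes, source_ip),
            presented)
        for s in live_secrets)


current_secret = os.urandom(32)   # time-variant secret, replaced periodically
c_i = os.urandom(16)              # Initiator cookie
c_r = responder_cookie(current_secret, c_i, 1, b"\x00\x02", "192.0.2.7")
```

Because the source address is an input, a cookie returned to one address fails validation when presented from another, and replacing the secret invalidates all outstanding cookies at once.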

Exchange State. In addition to the obvious benefits, path validation inhibits 
exhaustion of memory and storage resources. No storage state is created in the 
Responder until after a successful three-way handshake. 



Validating Messages. Initial integrity checking of every message is provided by
the UDP length and checksum, inhibiting casual message Modification. In the
later phases of the exchange, the combination of the privacy mask with checking
of the message padding values prevents appending modification. Chaining of
successive verification values in calculation of message validation and
resulting session keys aids in preventing Reflection and Replay.



6.5 Scalability 

A common predilection in the theoretical cryptological community is an
expressed desire to eliminate “interactiveness”, and otherwise minimize the
number of messages between parties. That appears to arise from the unwarranted
assumption that this would reduce the opportunity for interference.

However, in the Internet Security environment, an adversary that can inter- 
fere with any message can probably interfere with all of the messages. The key 
management protocol can only protect against late comers through verification 
of the whole message exchange. 

Pacing Messages. Interactivity can distribute computations over time, utiliz- 
ing inherent latency associated with geographic network topology. For a local 
network, there are few nodes with low latency between them. As the network 
environment expands, so does the round-trip time of the message exchanges, 
affording an opportunity to employ larger computational effort between passes, 
or to support a larger number of nodes with the same effort. 

In Photuris, each computationally expensive operation involves a separate 
message. As computational or network resources are available, the message pac- 
ing naturally varies, and prevents synchronization between multiple Photuris 
exchanges.

Reduced Computation. In addition to the obvious benefits, this arrangement
grants an opportunity for pre-computation of a public-value to be used in mul- 
tiple closely spaced Photuris exchanges. The pre-computed value can be sent 
immediately, allowing parallel computation of the resulting shared-secret during 
the round-trip time. 
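
A minimal sketch of this pre-computation, assuming a Diffie-Hellman style exchange, follows. The modulus and generator are toy values chosen only so that the example runs quickly; the class and method names are hypothetical.

```python
import secrets

# Toy parameters for illustration only: 2**127 - 1 is prime, but far too
# small for real use. An actual exchange scheme defines its own group.
P = 2**127 - 1
G = 3


class PrecomputedValue:
    """A private exponent x with its public value g^x, prepared in advance."""

    def __init__(self) -> None:
        self.x = secrets.randbelow(P - 3) + 2   # private exponent in 2..P-2
        self.public = pow(G, self.x, P)         # expensive step, done early

    def shared_secret(self, peer_public: int) -> int:
        # The second exponentiation can overlap with the round-trip time.
        return pow(peer_public, self.x, P)


initiator = PrecomputedValue()    # may be reused for closely spaced exchanges
responder = PrecomputedValue()
secret_i = initiator.shared_secret(responder.public)
secret_r = responder.shared_secret(initiator.public)
```

Both parties arrive at the same shared-secret, while the cost of producing each public value is paid before the exchange begins.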



6.6 Simplicity 

The hallmark of successful Internet protocols is that they are relatively simple. 
This aids in analysis of the protocol design, improves implementation interoper- 
ability, and reduces operational considerations. 

In Photuris, each message has a single purpose. Message fields are organized 
in the order that they are processed. Similar message fields appear in the same 
order in each message. 

No more than one optional feature is included in any message, and such 
options are listed at the end of the message. The format of options is the same 
as in other Internet protocols, so that implementation code is familiar and can 
be re-used from other projects. 

Although abundant combinations of algorithms offer great flexibility, only a 
few have been selected for inclusion in the underlying protocol. Choosing these 
selected schemes in advance allows intensive review of characteristics and po- 
tential interactions. This analysis can promote confidence in the security of the 
implementations. 

7 Conclusion 

Photuris provides a scalable solution for session-key management. Comprehensive
resource defenses ensure that deployment is robust. Provision of privacy
masking and forward secrecy raises a strong barrier against cryptanalysis of the
key management system.

The distinguishing terminology developed here is used to clarify “Photuris: 
Design Rationale”. Elaboration on the design of the messages will be found there. 

A Message Summary 

In Photuris, the traditional Alice (A) and Bob (B) are called the Initiator (I) 
and Responder (R) instead. The following sections describe an exchange where 
both parties have asymmetric keys, resulting in a pair of secret identities and 
associated symmetric secret-keys. 

When the parties already have existing secret keys (pre-configured or gener- 
ated by an earlier exchange), the Secret Exchange may be omitted. 

The Secret Exchange and SPI Messages may also flow in the other direction
(from Responder to Initiator). Only the Initiator to Responder form is
illustrated. See and for further details.

A.1 Cookie Exchange

The Initiator begins the exchange. The Responder provides an exchange counter
($C'$), a list of available exchange schemes ($S_O$), and a unique exchange
identifier ($M_{IR}$).

(1) $I \rightarrow R:\ c_I,\ 0,\ C$

(2) $I \leftarrow R:\ M_{IR},\ 1,\ C',\ S_O$

$C$ = previous counter (or zero)
$C'$ = assigned counter (usually $C' = C + 1$)
$c_I$ = Initiator Cookie
$c_R = H(K_{cR}, c_I, C', S_O, \text{Initiator IP Source}, \ldots)$
$K_{cR}$ = Responder cookie secret
$M_{IR} = c_I \| c_R$
$S_O$ = list of offered schemes

A.2 Value Exchange

The Initiator selects a scheme ($S_S$), and completes the initial three-way
handshake by returning the correct counter ($C'$) and exchange identifier
($M_{IR}$). The parties also exchange their public values ($g^x$, $g^y$) and
lists of available attributes ($A_{OI}$, $A_{OR}$). A shared-secret ($g^{xy}$)
is calculated.

(3) $I \rightarrow R:\ M_{IR},\ 2,\ TBV_I,\ g^x,\ A_{OI}$

(4) $I \leftarrow R:\ M_{IR},\ 3,\ TBV_R,\ g^y,\ A_{OR}$

$A_{OI}$ = Initiator list of offered attributes
$A_{OR}$ = Responder list of offered attributes
$S_S$ = scheme selected from $S_O$
$TBV_I = C' \| S_S$   Initiator three byte value
$TBV_R$ = zero (reserved)   Responder three byte value

$VV_I = TBV_I \| g^x \| A_{OI} \| TBV_R \| g^y \| A_{OR} \| S_O$
$VV_R = TBV_R \| g^y \| A_{OR} \| TBV_I \| g^x \| A_{OI} \| S_O$
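
To make the mirrored ordering of the two concatenated values concrete, the sketch below builds both from placeholder byte strings. All encodings here are hypothetical (a one-byte counter, a two-byte scheme identifier, and bracketed stand-ins for the public values and lists); only the field order follows the summary above.

```python
def concat(*fields: bytes) -> bytes:
    """Join fields in order (the concatenation operator in the summary)."""
    return b"".join(fields)


# Hypothetical placeholder encodings, for illustration only.
counter = b"\x01"                     # assigned counter
scheme = b"\x00\x02"                  # selected scheme
tbv_i = counter + scheme              # Initiator three byte value
tbv_r = b"\x00\x00\x00"               # Responder three byte value (reserved)
g_x, g_y = b"<g^x>", b"<g^y>"         # public values
a_oi, a_or = b"<A_OI>", b"<A_OR>"     # offered attribute lists
s_o = b"<S_O>"                        # offered schemes from the Cookie Exchange

# Each party lists its own contribution first, then the peer's, then S_O.
vv_i = concat(tbv_i, g_x, a_oi, tbv_r, g_y, a_or, s_o)
vv_r = concat(tbv_r, g_y, a_or, tbv_i, g_x, a_oi, s_o)
```

Since both values cover every field of the Cookie and Value Exchanges, a later check over either one detects alteration of the offered schemes or attributes.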

A.3 Secret Exchange (Optional)

The parties exchange public keys ($K_I$, $K_R$), and secret nonces
($k_{I \rightarrow R}$, $k_{R \rightarrow I}$) encrypted in those keys. These
nonces are combined with the current shared-secret to make high quality
symmetric secret keys ($K_{uI}$, $K_{uR}$) to be used in current and future
Identity Exchanges.

(5) $I \rightarrow R:\ M_{IR},\ 6,\ PSILT_I,\ PSI_I,\ E_{Kp''_I}(v''_I, K_I, Mp''_I)$

(6) $I \leftarrow R:\ M_{IR},\ 5,\ PSILT_R,\ PSI_R,\ E_{Kp''_R}(v''_R, X_R, K_R, Mp''_R)$

$Mp''_I$ = Initiator message padding
$Mp''_R$ = Responder message padding
$PSI_I$ = Initiator Party Secret Index
$PSI_R$ = Responder Party Secret Index
$PSILT_I$ = Initiator PSI LifeTime
$PSILT_R$ = Responder PSI LifeTime

$v''_I = MAC_{Kv''_I}(M_{IR}, 6, PSILT_I, PSI_I, K_I, Mp''_I, VV_I)$
$v''_R = MAC_{Kv''_R}(M_{IR}, 5, PSILT_R, PSI_R, X_R, K_R, Mp''_R, VV_R)$

$Kp''_I = H(g^x, g^y, M_{IR}, 6, PSILT_I, PSI_I, [g^{xy}])$   Initiator privacy key
$Kp''_R = H(g^y, g^x, M_{IR}, 5, PSILT_R, PSI_R, [g^{xy}])$   Responder privacy key
$Kv''_I = H(g^{xy})$   Initiator verification key
$Kv''_R = H(v''_I, g^{xy})$   Responder verification key

$K_I$ = Initiator public key
$K_R$ = Responder public key
$U_I = PSI_I \| v''_I$   Initiator party symmetric identity
$U_R = PSI_R \| v''_R \| PSI_I$   Responder party symmetric identity
$X_I = E_{K_R}(k_{I \rightarrow R})$
$X_R = E_{K_I}(k_{R \rightarrow I})$

$K_{uI} = H(M_{IR}, 6, PSILT_I, PSI_I, 5, PSILT_R, PSI_R, k_{I \rightarrow R}, k_{R \rightarrow I}, g^{xy})$
$K_{uR} = H(M_{IR}, 5, PSILT_R, PSI_R, 6, PSILT_I, PSI_I, k_{R \rightarrow I}, k_{I \rightarrow R}, g^{xy})$
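
The derivation of the two party symmetric secret keys can be sketched as follows. The choice of SHA-256 for H, the length-prefixed field encoding, and the one-byte message type values are assumptions for this example; only the order of the hashed fields follows the summary above.

```python
import hashlib


def H(*fields: bytes) -> bytes:
    """Hash an ordered list of fields (SHA-256 is an assumed choice)."""
    h = hashlib.sha256()
    for field in fields:
        # Length-prefix each field so the field boundaries are unambiguous.
        h.update(len(field).to_bytes(4, "big") + field)
    return h.digest()


def party_secret_keys(m_ir: bytes, psilt_i: bytes, psi_i: bytes,
                      psilt_r: bytes, psi_r: bytes,
                      k_ir: bytes, k_ri: bytes, g_xy: bytes):
    """Return (K_uI, K_uR); each party lists its own fields first."""
    k_ui = H(m_ir, b"\x06", psilt_i, psi_i, b"\x05", psilt_r, psi_r,
             k_ir, k_ri, g_xy)
    k_ur = H(m_ir, b"\x05", psilt_r, psi_r, b"\x06", psilt_i, psi_i,
             k_ri, k_ir, g_xy)
    return k_ui, k_ur


# Placeholder inputs, for illustration only.
fields = dict(m_ir=b"cookie-pair", psilt_i=b"\x0e\x10", psi_i=b"I-index",
              psilt_r=b"\x0e\x10", psi_r=b"R-index",
              k_ir=b"nonce-from-I", k_ri=b"nonce-from-R", g_xy=b"shared")
k_ui, k_ur = party_secret_keys(**fields)
```

Both keys depend on both nonces and on the shared-secret, so neither party alone determines the result, and the differing field order keeps the two directions distinct.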



A.4 Identity Exchange

The parties verify their identities by proving knowledge of the symmetric secrets,
and select attributes ($As_I$, $As_R$) for the generated session-keys
($Ks_I$, $Ks_R$).

(7) $I \rightarrow R:\ M_{IR},\ 4,\ SPILT_I,\ SPI_I,\ E_{Kp'_I}(X_I, v'_I, As_I, Mp'_I)$

(8) $I \leftarrow R:\ M_{IR},\ 7,\ SPILT_R,\ SPI_R,\ E_{Kp'_R}(U_R, v'_R, As_R, Mp'_R)$

$As_I$ = Initiator list of attributes selected from $A_{OR}$
$As_R$ = Responder list of attributes selected from $A_{OI}$
$Mp'_I$ = Initiator message padding
$Mp'_R$ = Responder message padding
$SPI_I$ = Initiator Security Parameters Index
$SPI_R$ = Responder Security Parameters Index
$SPILT_I$ = Initiator SPI LifeTime
$SPILT_R$ = Responder SPI LifeTime

$v'_I = MAC_{Kv'_I}(M_{IR}, 4, SPILT_I, SPI_I, U_I, As_I, Mp'_I, VV_I)$
$v'_R = MAC_{Kv'_R}(M_{IR}, 7, SPILT_R, SPI_R, U_R, v'_I, As_R, Mp'_R, VV_R)$

$Kp'_I = H(g^x, g^y, M_{IR}, 4, SPILT_I, SPI_I, [g^{xy}])$   Initiator privacy key
$Kp'_R = H(g^y, g^x, M_{IR}, 7, SPILT_R, SPI_R, [g^{xy}])$   Responder privacy key
$Kv'_I = H(K_{uI}, g^{xy})$   Initiator verification key
$Kv'_R = H(K_{uR}, g^{xy})$   Responder verification key

$Ks_I = H(M_{IR}, K_{uI}, K_{uR}, v'_I, [g^{xy}])$   Initiator session key
$Ks_R = H(M_{IR}, K_{uR}, K_{uI}, v'_R, [g^{xy}])$   Responder session key
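
The chaining of verification values into the session keys can be illustrated with the sketch below. HMAC-SHA-256 stands in for the MAC and SHA-256 for H; both are assumptions, as are the helper names and the grouping of message fields into `hdr` (lifetime and index) and `tail` (attributes and padding) byte strings.

```python
import hashlib
import hmac


def H(*fields: bytes) -> bytes:
    """Hash an ordered list of length-prefixed fields (assumed SHA-256)."""
    h = hashlib.sha256()
    for field in fields:
        h.update(len(field).to_bytes(4, "big") + field)
    return h.digest()


def mac(key: bytes, *fields: bytes) -> bytes:
    """Keyed MAC over the same encoding (assumed HMAC-SHA-256)."""
    m = hmac.new(key, digestmod=hashlib.sha256)
    for field in fields:
        m.update(len(field).to_bytes(4, "big") + field)
    return m.digest()


def verification_chain(m_ir, k_ui, k_ur, g_xy, u_i, u_r,
                       hdr_i, tail_i, hdr_r, tail_r, vv_i, vv_r):
    """Return (v_I, v_R, Ks_I, Ks_R) for one Identity Exchange."""
    kv_i = H(k_ui, g_xy)                      # Initiator verification key
    kv_r = H(k_ur, g_xy)                      # Responder verification key
    v_i = mac(kv_i, m_ir, b"\x04", hdr_i, u_i, tail_i, vv_i)
    # The Responder value covers the Initiator value, chaining the messages.
    v_r = mac(kv_r, m_ir, b"\x07", hdr_r, u_r, v_i, tail_r, vv_r)
    ks_i = H(m_ir, k_ui, k_ur, v_i, g_xy)     # Initiator session key
    ks_r = H(m_ir, k_ur, k_ui, v_r, g_xy)     # Responder session key
    return v_i, v_r, ks_i, ks_r


# Placeholder inputs, for illustration only.
args = dict(m_ir=b"cookie-pair", k_ui=b"K_uI", k_ur=b"K_uR", g_xy=b"shared",
            u_i=b"U_I", u_r=b"U_R", hdr_i=b"lt+spi-I", tail_i=b"attrs-I",
            hdr_r=b"lt+spi-R", tail_r=b"attrs-R", vv_i=b"VV_I", vv_r=b"VV_R")
v_i, v_r, ks_i, ks_r = verification_chain(**args)
```

Altering any field of the Initiator message changes its verification value, which in turn changes the Responder verification value and session key, so a replayed or reflected message from another exchange fails to verify.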



A.5 SPI Messages (Optional)

Either party may request another set of attributes at a later time, or provide
another session-key ($Ks_U$) to quickly replace one that is expiring.

$As_N$ = Needed list of attributes
$As_U$ = Update list of attributes
$Mp_N$ = Needed message padding
$Mp_U$ = Update message padding
$SPI_N$ = zero (reserved)
$SPI_U$ = Update Security Parameters Index
$SPILT_N$ = non-zero random
$SPILT_U$ = Update SPI LifeTime

$v_N = MAC_{Kv'_I}(M_{IR}, 8, SPILT_N, SPI_N, v'_I, As_N, Mp_N)$
$v_U = MAC_{Kv'_I}(M_{IR}, 9, SPILT_U, SPI_U, v'_I, As_U, Mp_U)$

$Kp_N = H(g^x, g^y, M_{IR}, 8, SPILT_N, SPI_N, [g^{xy}])$   Needed privacy key
$Kp_U = H(g^y, g^x, M_{IR}, 9, SPILT_U, SPI_U, [g^{xy}])$   Update privacy key

$Ks_U = H(M_{IR}, K_{uR}, K_{uI}, v_U, [g^{xy}])$   Update session key